
Problem description

I'm trying to parse a batch of links and append the parsed data to a sqlite3 database. I'm getting an error saying the database is locked, so maybe my pool size is too high? I tried lowering it to 5, but I still get the error shown below.

My code is basically this:

from multiprocessing import Pool

with Pool(5) as p:
    p.map(parse_link, links)

My actual code looks like this:

with Pool(5) as p:
    p.map(Get_FT_OU, file_to_set('links.txt'))
    # Get_FT_OU(link) parses a link and appends the result to a sqlite3 database.

When the code runs, I frequently hit these errors. Can anyone help me fix this?

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/Users/christian/Documents/GitHub/odds/CP_Parser.py", line 166, in Get_FT_OU
    cursor.execute(sql_str)
sqlite3.OperationalError: database is locked
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/christian/Documents/GitHub/odds/CP_Parser.py", line 206, in <module>
    p.map(Get_FT_OU, file_to_set('links.txt'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
sqlite3.OperationalError: database is locked

The code runs fine without multiprocessing, and I actually don't get any errors with Pool(2) either, but if I go higher I get these errors. I'm on the latest MacBook Air.

Tags: python, python-3.x, web-scraping, sqlite, multiprocessing

Solution


It works after adding timeout=10 to the connection. By default sqlite3 waits only 5 seconds for a competing lock to clear before raising OperationalError; a longer timeout gives the other workers time to finish their writes.

conn = sqlite3.connect(DB_FILENAME, timeout=10)
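A minimal sketch of how the fix fits into a worker function (the function and table names here are hypothetical; the asker's real Get_FT_OU also does the scraping):

```python
import sqlite3

DB_FILENAME = "odds.db"  # hypothetical path, stands in for the real database file

def get_ft_ou(link):
    # Each worker opens its own connection. timeout=10 makes SQLite
    # wait up to 10 seconds for another process's write lock to clear
    # instead of immediately raising "database is locked".
    conn = sqlite3.connect(DB_FILENAME, timeout=10)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS odds (link TEXT)")
        conn.execute("INSERT INTO odds (link) VALUES (?)", (link,))
        conn.commit()  # commit promptly so the write lock is held briefly
    finally:
        conn.close()
```

Note that SQLite allows only one writer at a time, so a larger pool just means more workers queuing for the same lock; committing quickly and keeping the timeout generous is what keeps the errors away.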
