python - Resource limits when using dataset to access sqlite
Problem description
How can I map a function that internally reads SQLite data over a large array? After applying it ~128 times, it seems to fail on me.
If
next(iter(tbl))['value']
is replaced with a fixed value, the code runs. So it does not appear to be a resource problem with constructing the connection ( dataset.connect(...)
) or table ( tbl=c['table']
) objects, but some leak in fetching values from the database.
Note: I am using the odd list(map(modify, data))
construct because my actual use case is applying this database-accessing function over a Spark RDD. This is the "plain Python" equivalent of my problem.
Test case:
import os
import dataset
import numpy

fname = '/tmp/test.db'

def ensure_db():
    if not os.path.exists(fname):
        c = ez_connection()
        tbl = c['table']
        tbl.insert({'value': 1.0})
        os.chmod(fname, 0o777)  # this is a left-over from when I thought file permissions might be a problem
    assert os.path.exists(fname)

def ez_connection():
    return dataset.connect('sqlite:///' + fname)

def modify(value):
    with ez_connection() as c:
        tbl = c['table']
        val = next(iter(tbl))['value']  # easy way to get the value out
    return val + value

if __name__ == "__main__":
    ensure_db()
    for i in range(4, 2048, 4):
        data = numpy.arange(i)
        print(f"about to map {i} items ...", end=' ')
        res = list(map(modify, data))
        print('OK')
which produces the output:
about to map 4 items ... OK
about to map 8 items ... OK
about to map 12 items ... OK
about to map 16 items ... OK
about to map 20 items ... OK
about to map 24 items ... OK
about to map 28 items ... OK
about to map 32 items ... OK
about to map 36 items ... OK
about to map 40 items ... OK
about to map 44 items ... OK
about to map 48 items ... OK
about to map 52 items ... OK
about to map 56 items ... OK
about to map 60 items ... OK
about to map 64 items ... Traceback (most recent call last):
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3212, in _wrap_pool_connect
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 307, in connect
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 767, in _checkout
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 425, in checkout
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 256, in _do_get
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 253, in _create_connection
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 368, in __init__
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 611, in __connect
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 605, in __connect
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 578, in connect
File "/home/Dave/pyspark-env/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 584, in connect
sqlite3.OperationalError: unable to open database file
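An "unable to open database file" error from a connect call that previously worked is often a symptom of running out of per-process file descriptors rather than a permissions problem. A quick way to check this theory is to look at the process limit and the number of descriptors currently open. This is a Linux-oriented sketch (the `resource` module and `/proc/self/fd` are assumptions about the platform):

```python
import os
import resource

# The per-process open-file limit that leaked connections eventually hit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

def open_fd_count():
    # Linux-only: each entry in /proc/self/fd is one open descriptor.
    return len(os.listdir('/proc/self/fd'))

print(f"currently open descriptors: {open_fd_count()}")
```

If the descriptor count climbs on every call to `modify` and the failure arrives when it reaches the soft limit, that points at leaked database handles.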
Solution
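Each `dataset.connect(...)` builds a new SQLAlchemy engine with its own connection pool, and the pooled sqlite handle is not necessarily closed when the `Database` object goes out of scope, so every call to `modify` can leave a file descriptor behind until the process hits its open-file limit. Two remedies follow from that: reuse a single `dataset.connect(...)` object per process instead of creating one per call, or make the connection lifetime explicit. The sketch below takes the second route using the stdlib `sqlite3` module, with a hypothetical path and the same table/column names as the test case above; it is an illustration of the pattern, not the only fix:

```python
import os
import sqlite3

# Hypothetical path for this sketch; mirrors fname in the test case.
fname = '/tmp/test_dataset_fix.db'

def ensure_db():
    if not os.path.exists(fname):
        con = sqlite3.connect(fname)
        # "table" is an SQL keyword, so it must be quoted as an identifier.
        con.execute('CREATE TABLE "table" (value REAL)')
        con.execute('INSERT INTO "table" (value) VALUES (1.0)')
        con.commit()
        con.close()

def modify(value):
    # Open, read, and explicitly close: no engine or pool is left behind,
    # so repeated calls do not accumulate open file descriptors.
    con = sqlite3.connect(fname)
    try:
        (val,) = con.execute('SELECT value FROM "table" LIMIT 1').fetchone()
    finally:
        con.close()
    return val + value
```

With this version the loop in the test case can run to completion, because the descriptor count stays flat across calls. If you want to stay with the dataset API, the equivalent change is to create the connection once at module level (or once per Spark partition via `mapPartitions`) and have `modify` reuse it, rather than calling `dataset.connect` inside `modify`.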