Python bz2 returns EOFError before reading the entire file

Problem description

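I am trying to lazily load items from a compressed file hosted on Zenodo. My goal is to yield items iteratively without storing the file on my computer. My problem is that an EOFError is raised right after the first non-empty line is read. How can I overcome this?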

Here is my code:

import requests as req
import json
from bz2 import BZ2Decompressor


def lazy_load(file_url):
    dec = BZ2Decompressor()
    with req.get(file_url, stream=True) as res:
        for chunk in res.iter_content(chunk_size=1024):
            data = dec.decompress(chunk).decode('utf-8')
            # do something with 'data'


if __name__ == "__main__":
    creds = json.load(open('credentials.json'))
    url = 'https://zenodo.org/api/records/'
    id = '4617285'
    filename = '10.Papers.nt.bz2'
    res = req.get(f'{url}{id}', params={'access_token': creds['zenodo_token']})
    for file in res.json()['files']:
        if file['key'] == filename:
            for item in lazy_load(file['links']['self']):
                pass  # do something with 'item'

The error I get is as follows:

Traceback (most recent call last):
  File ".\mag_loader.py", line 51, in <module>
    for item in lazy_load(file['links']['self']):
  File ".\mag_loader.py", line 18, in lazy_load
    data = dec.decompress(chunk)
EOFError: End of stream already reached

To run the code you need a Zenodo access token, which requires an account. Once logged in, you can create a token here: https://zenodo.org/account/settings/applications/tokens/new/

Tags: python, python-requests, stream, lazy-loading, bz2

Solution
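
A likely cause, stated here as an assumption because the archive itself was not inspected: the .bz2 file consists of several concatenated bz2 streams, as parallel compressors tend to produce. The Python documentation notes that BZ2Decompressor does not transparently handle multi-stream input; once it reaches the end of the first stream, its eof attribute becomes True and any further decompress() call raises EOFError. A new decompressor is needed for each stream, fed with the leftover bytes from unused_data. The sketch below illustrates that approach. The function name iter_decompressed_lines is hypothetical, and the incremental UTF-8 decoder is an addition to cope with multi-byte characters split across chunk boundaries, which a per-chunk .decode('utf-8') would not survive.

```python
import bz2
import codecs


def iter_decompressed_lines(chunks):
    """Yield text lines from an iterable of bz2-compressed byte chunks.

    Hypothetical helper: restarts the decompressor whenever one bz2
    stream ends, so concatenated (multi-stream) archives are handled.
    """
    decompressor = bz2.BZ2Decompressor()
    # A multi-byte UTF-8 character may straddle two chunks, so decode incrementally.
    decoder = codecs.getincrementaldecoder("utf-8")()
    pending = ""  # text received after the last newline seen so far

    for chunk in chunks:
        while chunk:
            raw = decompressor.decompress(chunk)
            chunk = b""
            if decompressor.eof:
                # This stream is finished; leftover bytes start the next stream.
                chunk = decompressor.unused_data
                decompressor = bz2.BZ2Decompressor()
            pending += decoder.decode(raw)
            # Everything before the last "\n" is a complete line.
            *lines, pending = pending.split("\n")
            yield from lines

    # Flush the decoder and emit a final line that has no trailing newline.
    pending += decoder.decode(b"", final=True)
    if pending:
        yield pending
```

In the question's lazy_load, this could be used by iterating over iter_decompressed_lines(res.iter_content(chunk_size=1024)) and yielding each line. Only "\n" is treated as a line separator here, and trailing non-bz2 bytes after the last stream are not handled; both are simplifications of this sketch.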

