Uploading large zip files to a website with Python

Problem description

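I have the following problem: I need to upload large .zip files (usually >500 MB, up to about 5 GB) to a website, which then processes them. I am doing this with Python 2.7.16 on 32-bit Windows. Unfortunately, due to company restrictions I cannot change my setup (from 32-bit to 64-bit) and cannot install additional Python packages (I have requests, urllib, urllib2, and a few others installed). My code currently looks like this: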

import requests

FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    print("Upload file: "+str(FilePath))
    session = requests.Session()
    with open(FilePath, "rb") as file:
        session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
    print("Upload done: "+str(FilePath))
    session.close()

Since my FileList is long (> 100 entries), I have only pasted an excerpt here. The code above works fine for files under 600 MB. Any file larger than that gives me this error:

  File "<stdin>", line 1, in <module>
  File "C:\Users\AAA253\Desktop\DingDong.py", line 39, in <module>
    session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':FilePath})
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 522, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 461, in request
    prep = self.prepare_request(req)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 394, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python27\lib\site-packages\requests\models.py", line 297, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python27\lib\site-packages\requests\models.py", line 455, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python27\lib\site-packages\requests\models.py", line 158, in _encode_files
    body, content_type = encode_multipart_formdata(new_fields)
  File "C:\Python27\lib\site-packages\requests\packages\urllib3\filepost.py", line 86, in encode_multipart_formdata
    body.write(data)
MemoryError

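I have already searched this forum for solutions, but unfortunately could not find anything suitable. Does anyone know how to get this working? Can it be done by loading the file in chunks? If so, how do I upload the file in chunks so that the server does not "cancel" the operation?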

Edit: Following @AKX's answer, I am now using the following code:

import requests
from requests_toolbelt.multipart import encoder

FileList=["C:\File01.zip", "C:\FileA02.zip", "C:\UserFile993.zip"]
UploadURL = "https://mywebsite.com/submitFile"
for FilePath in FileList:
    session = requests.Session()
    with open(FilePath, 'rb') as f:
        form = encoder.MultipartEncoder({"documents": (FilePath, f, "application/octet-stream"),"composite": "NONE",})
        headers = {"Prefer": "respond-async", "Content-Type": form.content_type}
        resp = session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
    session.close()

Nevertheless, I get almost the same error:

  File "<stdin>", line 1, in <module>
  File "C:\Users\AAA253\Desktop\DingDong.py", line 48, in <module>
    resp =  session.post(UploadURL,data={'file':'Send file'},files={'FileToBeUploaded':form})
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 578, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 317, in prepare
    self.prepare_body(data, files, json)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 505, in prepare_body
    (body, content_type) = self._encode_files(files, data)
  File "C:\Python27\lib\site-packages\requests-2.24.0-py2.7.egg\requests\models.py", line 159, in _encode_files
    fdata = fp.read()
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 314, in read
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 194, in _load
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 256, in _write
  File "build\bdist.win32\egg\requests_toolbelt\multipart\encoder.py", line 552, in append
MemoryError

Tags: python, python-2.7

Solution


You will most likely need the streaming MultipartEncoder from requests-toolbelt.

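Even if your company restrictions prohibit installing new packages, you can vendor the parts of requests_toolbelt you need (possibly the whole package) into your project directory.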
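
One possible way to make a vendored copy importable, assuming you place the `requests_toolbelt` package folder in a `vendor` subdirectory next to your script (the directory name and the helper `add_vendor_dir` are only illustrative):

```python
import os
import sys


def add_vendor_dir(base_dir, name="vendor"):
    """Put <base_dir>/<name> at the front of sys.path and return its path.

    Hypothetical helper; "vendor" is just an assumed directory name.
    """
    vendor_dir = os.path.join(os.path.abspath(base_dir), name)
    if vendor_dir not in sys.path:
        sys.path.insert(0, vendor_dir)
    return vendor_dir


# Intended use at the top of the upload script, before importing the package:
#   add_vendor_dir(os.path.dirname(os.path.abspath(__file__)))
#   from requests_toolbelt.multipart.encoder import MultipartEncoder
```

Copying the package folder directly beside the script also works without any path changes, since the script's own directory is already on the import path.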

