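python - How to download files from Google Drive through the API using a for loop
Question
When I retrieve CSV files from Google Drive through the API, the files I get have no content.
The code below has three parts (1: authenticate, 2: search for files, 3: download the files).
I suspect the problem is in step 3, specifically around the line while done is False, because I have no trouble accessing Google Drive or downloading the files. They are just all empty.
It would be great if someone could tell me how to fix it. Most of the code below is borrowed from Google's website. Thanks in advance for your time!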
Step 1: Authentication
from apiclient import discovery
from httplib2 import Http
import oauth2client
from oauth2client import file, client, tools

obj = lambda: None  # a bare function object used as an attribute container for the flags
auth = {"auth_host_name": 'localhost', 'noauth_local_webserver': 'store_true', 'auth_host_port': [8080, 8090], 'logging_level': 'ERROR'}
for k, v in auth.items():
    setattr(obj, k, v)

scopes = 'https://www.googleapis.com/auth/drive'
store = file.Storage('token_google_drive2.json')
creds = store.get()
# The following takes the user to an authentication link if no valid token file is found.
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_id.json', scopes)
    creds = tools.run_flow(flow, store, obj)
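The flags container built from a lambda above can also be written as an argparse.Namespace, which is the kind of object tools.run_flow usually receives. The following is a minimal sketch, assuming these four attributes are all that run_flow reads. The name flags is hypothetical, and noauth_local_webserver is set to False on the assumption that the local web server flow was intended (the original string 'store_true' is truthy, which would select the console flow instead).

```python
import argparse

# Hypothetical replacement for the `obj = lambda: None` container.
flags = argparse.Namespace(
    auth_host_name='localhost',
    noauth_local_webserver=False,  # assumption: use the local web server flow
    auth_host_port=[8080, 8090],
    logging_level='ERROR',
)

# This would replace the call that passes obj:
# creds = tools.run_flow(flow, store, flags)
```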
Step 2: Search for files and build a dictionary of the files to download
from googleapiclient.discovery import build

page_token = None
drive_service = build('drive', 'v3', credentials=creds)
# Initialise the lists once, before the loop, so results from every page are kept.
name_list = []
id_list = []
while True:
    response = drive_service.files().list(
        q="mimeType='text/csv' and name contains 'RR' and name contains '20191001'",
        spaces='drive',
        fields='nextPageToken, files(id, name)',
        pageToken=page_token).execute()
    # "item" rather than "file", so the oauth2client "file" module is not shadowed
    for item in response.get('files', []):
        # name and id are strings, so collect them in lists before creating a dictionary
        name_list.append(item.get('name'))
        id_list.append(item.get('id'))
    page_token = response.get('nextPageToken', None)
    if page_token is None:
        break
# Remove ":" from the names, or the downloaded files are nowhere to be found in the folder.
name_list = [word.replace(':', '') for word in name_list]
# Create the dictionary from name_list and id_list
temp_dic = dict(zip(name_list, id_list))
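The paginated search above can be wrapped in a small helper so that results from every page end up in one dictionary. This is only a sketch: list_all_files and its parameters are hypothetical names, and service is assumed to be a Drive v3 service object like drive_service. Because file names are used as keys, two Drive files with the same name would collapse into one entry.

```python
def list_all_files(service, query):
    """Return {sanitised name: file id} for every match, across all pages."""
    found = {}
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            spaces='drive',
            fields='nextPageToken, files(id, name)',
            pageToken=page_token,
        ).execute()
        for item in response.get('files', []):
            # Strip ":" from the name, as the original code does.
            found[item['name'].replace(':', '')] = item['id']
        page_token = response.get('nextPageToken')
        if page_token is None:
            return found
```

With the query from the question, the call would look like temp_dic = list_all_files(drive_service, "mimeType='text/csv' and name contains 'RR' and name contains '20191001'").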
Step 3: Download the files (the troublesome part)
import io
from googleapiclient.http import MediaIoBaseDownload

for i in range(len(temp_dic.values())):
    file_id = list(temp_dic.values())[i]
    v = list(temp_dic.keys())[i]
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.FileIO(v, mode='w')
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        status_complete = int(status.progress()*100)
        print(f'Download of {len(temp_dic.values())} files, {int(status.progress()*100)}%')
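Solution
I actually figured it out myself. The edit is below. All I needed to do was remove the lines
done = False
while done is False:
and add fh.close() to close the file handle.
The complete revised part 3 is as follows: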
import io
from googleapiclient.http import MediaIoBaseDownload

for i in range(len(temp_dic.values())):
    file_id = list(temp_dic.values())[i]
    v = list(temp_dic.keys())[i]
    request = drive_service.files().get_media(fileId=file_id)
    # v is the local filename (with extension) that the download is written to
    fh = io.FileIO(v, mode='wb')  # open the file for writing in binary mode
    downloader = MediaIoBaseDownload(fh, request)
    status, done = downloader.next_chunk()  # fetches a single chunk only
    status_complete = int(status.progress()*100)
    print(f'{v} is {status_complete}% downloaded')
    fh.close()  # close the file handle once the chunk has been written
print(f'{len(list(temp_dic.keys()))} files')
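A single next_chunk() call fetches only one chunk, so the version above is complete only for files that fit within one chunk. A variant that keeps the loop and lets a with block close the file might look like the sketch below. download_to_path and make_downloader are hypothetical names; make_downloader is assumed to be called as make_downloader(fh, request), which matches how MediaIoBaseDownload is constructed in the code above.

```python
def download_to_path(request, path, make_downloader):
    """Write the media behind `request` to `path`, fetching chunks until done."""
    # The with block closes the file even if a chunk request raises.
    with open(path, 'wb') as fh:
        downloader = make_downloader(fh, request)
        done = False
        while not done:
            status, done = downloader.next_chunk()
            if status:
                print(f'{path}: {int(status.progress() * 100)}%')
    return path

# Possible usage with the names from the code above
# (requires google-api-python-client):
# for name, file_id in temp_dic.items():
#     request = drive_service.files().get_media(fileId=file_id)
#     download_to_path(request, name, MediaIoBaseDownload)
```

Iterating over temp_dic.items() also avoids rebuilding the key and value lists on every pass of the loop.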