python - How do I read an Excel file from AWS in Python 3?
Problem description
The script below runs without errors in Python 2, but in Python 3 I get the following error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-11eb1acac6cc> in <module>
16 response = s3client.get_object(Bucket='db_region_xyz' , Key='Aaa_bbb/Tester.xlsx')
17
---> 18 dataset = pd.read_excel(response['Body'])
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
302
303 if not isinstance(io, ExcelFile):
--> 304 io = ExcelFile(io, engine=engine)
305 elif engine and engine != io.engine:
306 raise ValueError(
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in __init__(self, io, engine)
819 self._io = stringify_path(io)
820
--> 821 self._reader = self._engines[engine](self._io)
822
823 def __fspath__(self):
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py in __init__(self, filepath_or_buffer)
19 err_msg = "Install xlrd >= 1.0.0 for Excel support"
20 import_optional_dependency("xlrd", extra=err_msg)
---> 21 super().__init__(filepath_or_buffer)
22
23 @property
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in __init__(self, filepath_or_buffer)
348 elif hasattr(filepath_or_buffer, "read"):
349 # N.B. xlrd.Book has a read attribute too
--> 350 filepath_or_buffer.seek(0)
351 self.book = self.load_workbook(filepath_or_buffer)
352 elif isinstance(filepath_or_buffer, str):
AttributeError: 'StreamingBody' object has no attribute 'seek'
What changes do I have to make to the script below so that it runs in Python 3?
import boto3
from io import StringIO
from boto3 import session
import pandas as pd
import numpy as np
session = boto3.session.Session(region_name='region_xyz')
s3client = session.client('s3' , config=boto3.session.Config(signature_version='s3v4'))
response = s3client.get_object(Bucket='db_region_xyz' , Key='Aaa_bbb/Tester.xlsx')
dataset = pd.read_excel(response['Body'])
Regards
Solution
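The StreamingBody in response['Body'] has read() but no seek(), and pd.read_excel calls seek(0) on file-like input. Read the bytes into an in-memory io.BytesIO buffer, which is seekable, and pass that instead. Note that the original script only imports StringIO, so the io module itself must be imported:
import io
dataset = pd.read_excel(io.BytesIO(response['Body'].read()))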
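The one-line fix can be wrapped in small helpers, sketched below. This is a minimal illustration rather than a definitive implementation: the names to_seekable, read_excel_from_s3, and FakeStreamingBody are hypothetical and do not come from boto3 or pandas. FakeStreamingBody only imitates the behaviour seen in the traceback (a read() method and no seek()), so the demonstration runs without AWS credentials or pandas installed.

```python
import io


def to_seekable(body):
    """Read a read()-only stream fully into memory and return a seekable buffer."""
    return io.BytesIO(body.read())


def read_excel_from_s3(s3client, bucket, key, **read_excel_kwargs):
    """Hypothetical helper: fetch an S3 object and parse it with pandas."""
    import pandas as pd  # imported lazily so to_seekable works without pandas

    response = s3client.get_object(Bucket=bucket, Key=key)
    return pd.read_excel(to_seekable(response["Body"]), **read_excel_kwargs)


class FakeStreamingBody:
    """Stand-in for the S3 response body: it has read() but no seek()."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # Like a real stream, the data can only be consumed once.
        data, self._data = self._data, b""
        return data


body = FakeStreamingBody(b"excel bytes would go here")
buffer = to_seekable(body)
buffer.seek(0)  # works on the buffer; the fake body has no seek() at all
print(hasattr(body, "seek"), hasattr(buffer, "seek"))  # prints: False True
```

With a real client, the call would look like read_excel_from_s3(s3client, 'db_region_xyz', 'Aaa_bbb/Tester.xlsx'). Because read() loads the whole object into memory, this approach suits files that fit comfortably in RAM.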