How do I read an Excel file from AWS in Python 3?

Problem description

The script below runs without any errors in Python 2, but in Python 3 I get the following error.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-11eb1acac6cc> in <module>
     16 response = s3client.get_object(Bucket='db_region_xyz' , Key='Aaa_bbb/Tester.xlsx')
     17 
---> 18 dataset = pd.read_excel(response['Body'])
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols, **kwds)
    302 
    303     if not isinstance(io, ExcelFile):
--> 304         io = ExcelFile(io, engine=engine)
    305     elif engine and engine != io.engine:
    306         raise ValueError(
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in __init__(self, io, engine)
    819         self._io = stringify_path(io)
    820 
--> 821         self._reader = self._engines[engine](self._io)
    822 
    823     def __fspath__(self):
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_xlrd.py in __init__(self, filepath_or_buffer)
     19         err_msg = "Install xlrd >= 1.0.0 for Excel support"
     20         import_optional_dependency("xlrd", extra=err_msg)
---> 21         super().__init__(filepath_or_buffer)
     22 
     23     @property
/opt/anaconda3/lib/python3.7/site-packages/pandas/io/excel/_base.py in __init__(self, filepath_or_buffer)
    348         elif hasattr(filepath_or_buffer, "read"):
    349             # N.B. xlrd.Book has a read attribute too
--> 350             filepath_or_buffer.seek(0)
    351             self.book = self.load_workbook(filepath_or_buffer)
    352         elif isinstance(filepath_or_buffer, str):
AttributeError: 'StreamingBody' object has no attribute 'seek'

What changes do I need to make to the script below so that it runs in Python 3?

import boto3
from io import StringIO
from boto3 import session
import pandas as pd
import numpy as np

session = boto3.session.Session(region_name='region_xyz')
s3client = session.client('s3' , config=boto3.session.Config(signature_version='s3v4'))
response = s3client.get_object(Bucket='db_region_xyz' , Key='Aaa_bbb/Tester.xlsx')

dataset = pd.read_excel(response['Body'])

Regards

Tags: python, amazon-s3

Solution


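The StreamingBody returned by get_object has a read() method but no seek(), and the traceback shows pandas calling seek(0) before it loads the workbook. Read the whole object into memory and wrap the bytes in an io.BytesIO buffer, which is seekable. This needs import io; the script in the question only imports StringIO from io.

import io

dataset = pd.read_excel(io.BytesIO(response['Body'].read()))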
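
The fix above can be tried without AWS access. The sketch below is a minimal illustration, not the library's own code: FakeStreamingBody is a hypothetical stand-in that, like the object in the traceback, offers read() but no seek(), and to_seekable is an illustrative helper name that is not part of boto3 or pandas.

```python
import io


def to_seekable(body):
    """Read a read()-only binary stream fully into memory and return a seekable buffer."""
    return io.BytesIO(body.read())


class FakeStreamingBody:
    """Hypothetical stand-in for the S3 response body: has read() but no seek()."""

    def __init__(self, data):
        self._raw = io.BytesIO(data)

    def read(self, amt=None):
        return self._raw.read(amt)


# Placeholder bytes, not a real workbook; used only to show the buffering step.
body = FakeStreamingBody(b"PK\x03\x04 not a real workbook")
buffer = to_seekable(body)

# With a real S3 response, the buffer would be passed to pandas instead:
#   dataset = pd.read_excel(to_seekable(response['Body']))
```

Because the whole object is held in memory, this approach is best suited to files that comfortably fit in RAM.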
