Pandas read_pickle from an S3 bucket

Problem description

I am using a Jupyter notebook on AWS EMR.

I am able to do this: pd.read_csv("s3://mypath/xyz.csv")

However, if I try to open a pickle file the same way, pd.read_pickle("s3://mypath/xyz.pkl")

I get this error:

[Errno 2] No such file or directory: 's3://pvarma1/users/users/candidate_users.pkl'
Traceback (most recent call last):
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 179, in read_pickle
    return try_read(path)
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 177, in try_read
    lambda f: pc.load(f, encoding=encoding, compat=True))
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/pickle.py", line 146, in read_wrapper
    is_text=False)
  File "/usr/local/lib64/python2.7/site-packages/pandas/io/common.py", line 421, in _get_handle
    f = open(path_or_buf, mode)
IOError: [Errno 2] No such file or d

However, I can see both xyz.csv and xyz.pkl at the same path! Can anyone help?

Tags: python, pandas, amazon-web-services, amazon-s3

Solution


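In the pandas version shown in the traceback, read_pickle supports only local paths, unlike read_csv: the traceback ends with the path being passed straight to open(), which cannot resolve an s3:// URL. So you should first copy the pickle file to your machine and then read it with pandas.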
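
One way to do the copy step is sketched below. It assumes Python 3 and that boto3 is installed with AWS credentials already configured (neither is stated in the question). The helper names split_s3_url and read_pickle_from_s3 are hypothetical, introduced only for this illustration. Newer pandas releases may accept s3:// URLs in read_pickle directly, so check the documentation for your version before adding a workaround like this.

```python
import os
import tempfile

import pandas as pd


def split_s3_url(url):
    """Split 's3://bucket/some/key' into (bucket, key). Hypothetical helper."""
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError("not an s3:// URL: %r" % (url,))
    bucket, _, key = url[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError("URL must contain a bucket and a key: %r" % (url,))
    return bucket, key


def read_pickle_from_s3(url, client=None):
    """Download the object to a temporary local file, then unpickle it.

    `client` is any object with a boto3-style download_file method;
    when omitted, a default boto3 S3 client is created.
    """
    if client is None:
        import boto3  # assumption: boto3 is installed and credentials are set up
        client = boto3.client("s3")
    bucket, key = split_s3_url(url)
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep the original file name so pandas can infer compression from it.
        file_name = os.path.basename(key) or "object.pkl"
        local_path = os.path.join(tmp_dir, file_name)
        client.download_file(Bucket=bucket, Key=key, Filename=local_path)
        # The object is fully loaded before the temporary directory is removed.
        return pd.read_pickle(local_path)
```

With the path from the error message, the call would look like df = read_pickle_from_s3("s3://pvarma1/users/users/candidate_users.pkl"). As with any pickle, only load files from a source you trust, because unpickling can execute arbitrary code.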

