Pickling a class derived from Pandas DataFrame loses object attributes

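Problem description

I am trying to customize the pickling behavior of a derived class, but I cannot get the object's attributes to survive pickling.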

import pandas as pd

class Fruit(pd.DataFrame): 
    def __init__(self):
        super(Fruit, self).__init__()
        self.codebook={3:4}

sdf = Fruit()

sdf.codebook ={1:2}

import pickle

with open('mypickle.pickle', 'wb') as f:
    pickle.dump(sdf, f)
with open('mypickle.pickle', 'rb') as f:  # pickle data must be read in binary mode
    loaded_obj = pickle.load(f)
assert loaded_obj.__class__ == Fruit
assert loaded_obj.codebook

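The last assertion fails with: AttributeError: 'Fruit' object has no attribute 'codebook'

If my class definition is simply class Fruit(): pass, it works fine.

How can I pickle my object so that the new "dot" attribute (.codebook) is included?

(This question relates to an open project that aims to add survey-data codebook information and functionality to the Pandas DataFrame class: https://github.com/cpbl/surveypandas )

Update:

I tried replacing DataFrame.to_pickle with Fruit's own version, which calls pickle.dump(self) directly, and it fails in the same way. (Why?)

Here is an inelegant attempt that does manage to save everything, but it produces a file that pandas.read_pickle() cannot make sense of. It also seems needlessly awkward: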

from pandas import DataFrame 
import pickle as pkl  # cPickle exists only on Python 2; pickle works on Python 3

class Fruit(DataFrame): 
    def __init__(self, data=None, codebook =None):
        super(Fruit, self).__init__(data)
        self.codebook=codebook
    def to_pickle(self, path, compression='infer',
              protocol= pkl.HIGHEST_PROTOCOL):
        """ See pandas.io.pickle and pandas.DataFrame.to_pickle """
        from pandas.io.pickle import to_pickle as pandas_to_pickle
        return pandas_to_pickle( {'o': self, 'c':self.codebook}, path, compression=compression, protocol=protocol)

sdf = Fruit({'foo':[1,2,3]}, codebook={1:2})
sdf.to_pickle('mypickle.pickle')

with open('mypickle.pickle', 'rb') as f:  # binary mode is required to unpickle
    loaded_obj = pkl.load(f)
sdf2 = Fruit(loaded_obj['o'], codebook = loaded_obj['c'])

assert sdf2.__class__ == Fruit
assert sdf2.codebook

Tags: python, pandas, class

Solution
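
DataFrame defines its own __getstate__/__setstate__, so ordinary instance attributes set on a subclass are not part of the pickled state. One approach described in the pandas subclassing documentation is to list such attributes in the class-level _metadata list and to override _constructor. The following is a minimal sketch under that assumption, not a definitive implementation: the keyword-only codebook parameter is an illustrative choice that is not in the original code, and behavior may differ between pandas versions.

```python
import pickle

import pandas as pd


class Fruit(pd.DataFrame):
    # Attribute names listed in _metadata are treated by pandas as
    # subclass-level properties and are carried along in the pickled state.
    _metadata = ["codebook"]

    def __init__(self, *args, codebook=None, **kwargs):
        # "codebook" is a hypothetical keyword-only argument for this sketch.
        super().__init__(*args, **kwargs)
        self.codebook = codebook

    @property
    def _constructor(self):
        # Ask pandas to build Fruit objects (not plain DataFrames)
        # when operations create new frames from this one.
        return Fruit


sdf = Fruit({"foo": [1, 2, 3]}, codebook={1: 2})

# Round-trip through pickle in memory; a file opened in 'wb'/'rb' mode
# would work the same way.
restored = pickle.loads(pickle.dumps(sdf))
```

With this definition, restored should be a Fruit whose codebook attribute matches the original, so the two assertions from the question would be expected to pass. Note that unpickling does not call __init__, which is why the attribute has to travel through the pickled state rather than being re-created by the constructor.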

