python pandas rolling apply custom function returns KeyError

Problem description

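I am trying to apply a custom function to a rolling window, but it raises a KeyError and I am not sure why or how to fix it. I have looked here and here, but the answers did not solve the problem.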

Here is my code to reproduce the error:

import pandas as pd 
import numpy as np
from sklearn.feature_selection import chi2

def return_chi2(df):                                                               
    return chi2(df['signal'].to_numpy().reshape(len(df['signal'].index),1), 
        df['PnL_binary'].to_numpy().reshape(len(df['PnL_binary'].index),1))[1][0]

df = pd.DataFrame()
df['signal'] = [0,0,0,1,1,1,0,0,0,1]
df['PnL_binary'] = [0,0,0,1,1,1,0,0,0,0]

return_chi2(df)
>>>0.04953461343562649

So far so good: the function works and returns the chi-squared p-value. The problem comes when applying it to a rolling window:

df.rolling(3).apply(return_chi2)

    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 135, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index_class_helper.pxi", line 109, in pandas._libs.index.Int64Engine._check_type
KeyError: 'signal'

I think the error comes from apply trying to look up 'signal' as an index value rather than as a column. I tried:

df.rolling(3).apply(return_chi2, axis=1) 

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: apply() got an unexpected keyword argument 'axis'

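I am not sure where to go from here. I could use something like iterrows and manually slice a rolling window over the whole df, but that does not seem very pythonic. Is there a better way to do this? Any help would be appreciated.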

Tags: python, pandas, apply

Solution
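
One possible explanation, offered as a sketch rather than a definitive answer: Rolling.apply evaluates the function on one column at a time and hands it a 1-D window (a Series with raw=False, or a NumPy array with raw=True), never a multi-column DataFrame. Inside return_chi2, df['signal'] is then treated as an index label lookup on that Series, which would be consistent with the KeyError above. A workaround under that assumption is to roll over row positions and slice the DataFrame yourself. In the code below, rolling_apply_frame is a hypothetical helper (not a pandas API) and agreement is a stand-in window function used only to keep the example free of extra dependencies.

```python
import numpy as np
import pandas as pd


def rolling_apply_frame(df, window, func):
    """Apply func to every full window of rows, passing a DataFrame slice.

    Hypothetical helper, not part of pandas. Rows before the first full
    window are left as NaN, similar to the default Rolling.apply output.
    """
    out = pd.Series(np.nan, index=df.index, dtype=float)
    for end in range(window, len(df) + 1):
        window_df = df.iloc[end - window:end]  # all columns, `window` rows
        out.iloc[end - 1] = func(window_df)
    return out


df = pd.DataFrame({
    "signal":     [0, 0, 0, 1, 1, 1, 0, 0, 0, 1],
    "PnL_binary": [0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
})

# Record what Rolling.apply actually passes to the function:
# each call receives the window of a single column.
seen = []
df.rolling(3).apply(lambda w: seen.append(type(w).__name__) or 0.0, raw=False)


def agreement(window_df):
    # Stand-in for a two-column statistic: share of rows where both columns match.
    return float((window_df["signal"] == window_df["PnL_binary"]).mean())


result = rolling_apply_frame(df, 3, agreement)
```

The return_chi2 function from the question could be passed in place of agreement in the same way, since it only needs a DataFrame with the 'signal' and 'PnL_binary' columns. How chi2 behaves on very small windows, or on windows containing a single class, was not checked here.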

