pandas - Trouble using pandas df.rolling() with my own function
Problem description
I have a pandas DataFrame raw_data with two columns, "T" and "BP":
              T       BP
0        -0.500  115.790
1        -0.499  115.441
2        -0.498  115.441
3        -0.497  115.441
4        -0.496  115.790
...         ...      ...
647163  646.663  105.675
647164  646.664  105.327
647165  646.665  105.327
647166  646.666  105.327
647167  646.667  104.978
[647168 rows x 2 columns]
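I want to apply the Hodges-Lehmann mean (a robust average) over a rolling window and store the result in a new column. Here is the function:
def hodgesLehmannMean(x):
    # All pairwise sums x[i] + x[j]
    m = np.add.outer(x, x)
    # Lower triangle (diagonal included), so each unordered pair counts once
    ind = np.tril_indices(len(x), 0)
    return 0.5 * np.median(m[ind])
So I wrote: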
raw_data[new_col] = raw_data['BP'].rolling(21, min_periods=1, center=True,
    win_type=None, axis=0, closed=None).agg(hodgesLehmannMean)
But I get a long string of error messages:
Traceback (most recent call last):
File "C:\Users\tkpme\miniconda3\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\tkpme\miniconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:\Users\tkpme\.vscode\extensions\ms-python.python-2020.8.101144\pythonFiles\lib\python\debugpy\__main__.py", line 45, in <module>
cli.main()
File "c:\Users\tkpme\.vscode\extensions\ms-python.python-2020.8.101144\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 430, in main
run()
File "c:\Users\tkpme\.vscode\extensions\ms-python.python-2020.8.101144\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 267, in run_file
runpy.run_path(options.target, run_name=compat.force_str("__main__"))
File "C:\Users\tkpme\miniconda3\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\tkpme\miniconda3\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\tkpme\miniconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:\Users\tkpme\OneDrive\Documents\Work\CMC\BP Satya and Suresh\Code\Naveen_peak_detect test.py", line 227, in <module>
main()
File "c:\Users\tkpme\OneDrive\Documents\Work\CMC\BP Satya and Suresh\Code\Naveen_peak_detect test.py", line 75, in main
raw_data[new_col] = raw_data['BP'].rolling(FILTER_WINDOW, min_periods=1, center=True, win_type=None,
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 1961, in aggregate
return super().aggregate(func, *args, **kwargs)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 523, in aggregate
return self.apply(func, raw=False, args=args, kwargs=kwargs)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 1987, in apply
return super().apply(
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 1300, in apply
return self._apply(
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 507, in _apply
result = calc(values)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 495, in calc
return func(x, start, end, min_periods)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\window\rolling.py", line 1326, in apply_func
return window_func(values, begin, end, min_periods)
File "pandas\_libs\window\aggregations.pyx", line 1375, in pandas._libs.window.aggregations.roll_generic_fixed
File "c:\Users\tkpme\OneDrive\Documents\Work\CMC\BP Satya and Suresh\Code\Naveen_peak_detect test.py", line 222, in hodgesLehmannMean
m = np.add.outer(x, x)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\series.py", line 705, in __array_ufunc__
return construct_return(result)
File "C:\Users\tkpme\miniconda3\lib\site-packages\pandas\core\series.py", line 694, in construct_return
raise NotImplementedError
NotImplementedError
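This seems to be triggered by the line
    m = np.add.outer(x, x)
and suggests that something is not implemented or that numpy is missing. However, I import numpy at the very beginning, as follows: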
import numpy as np
import pandas as pd
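The function itself works fine if I feed it a list or a numpy array, so I'm not sure what the problem is. Interestingly, if I use the median instead of the Hodges-Lehmann mean, it works like a charm: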
raw_data[new_col] = raw_data['BP'].rolling(21, min_periods=1, center=True,
    win_type=None, axis=0, closed=None).median()
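What is the cause of my problem, and how can I fix it?
Sincerely,
Thomas Philips
Solution
I tried your code with a small DataFrame and it ran fine, so there may be something in your DataFrame that needs to be cleaned or converted.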
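One more thing that may be worth trying: the traceback in the question shows aggregate() calling self.apply(func, raw=False, ...), and the NotImplementedError is raised from Series.__array_ufunc__, so each window appears to reach the function as a pandas Series rather than a numpy array. Below is a minimal sketch of a possible workaround, assuming that is the cause. It passes raw=True to rolling(...).apply() and also coerces the input with np.asarray. The function name hodges_lehmann_mean, the column name "BP_HL", the window size of 3, and the five-row sample (copied from the head of the data in the question) are illustrative choices, not taken from the original code.

```python
import numpy as np
import pandas as pd


def hodges_lehmann_mean(x):
    # Coerce to a plain ndarray so np.add.outer never sees a pandas Series
    # (assumption: the Series input is what triggers NotImplementedError).
    x = np.asarray(x, dtype=float)
    # All pairwise sums x[i] + x[j].
    m = np.add.outer(x, x)
    # Lower triangle including the diagonal: each unordered pair once.
    ind = np.tril_indices(len(x), 0)
    # Median of the pairwise averages.
    return 0.5 * np.median(m[ind])


# Small illustrative frame built from the first rows shown in the question.
raw_data = pd.DataFrame(
    {
        "T": [-0.500, -0.499, -0.498, -0.497, -0.496],
        "BP": [115.790, 115.441, 115.441, 115.441, 115.790],
    }
)

# raw=True asks pandas to hand each window to the function as an ndarray.
raw_data["BP_HL"] = (
    raw_data["BP"]
    .rolling(3, min_periods=1, center=True)
    .apply(hodges_lehmann_mean, raw=True)
)

print(raw_data)
```

With the real data the window would be 21 as in the question. Note that the function builds an n-by-n matrix per window, so on 647168 rows this will be much slower than the built-in .median().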