How to cut unsorted time series data into bins with a minimum interval?

Problem description

I have a DataFrame like this:

import pandas as pd
x = pd.DataFrame({'a': [1.1341, 1.13421, 1.13433, 1.13412, 1.13435, 1.13447, 1.13459, 1.13452, 1.13471, 1.1348, 1.13496, 1.13474, 1.13483, 1.1349, 1.13502, 1.13515, 1.13526, 1.13512]})

How can we split this series to get the following output, so that the minimum difference is at least 0.0005?

x['output'] =  [1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0]


Tags: python, pandas, time-series

Solution


I don't believe there is a vectorized way to do this, because each comparison depends on the most recently marked value, so you probably need to loop through the values.

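x = x.assign(output=0)  # Initialize all the output values to zero.
out_col = x.columns.get_loc('output')  # Positional index of the output column.
x.iat[0, out_col] = 1  # The first row always starts a bin.
threshold = 0.0005
prior_val = x['a'].iat[0]
for n, val in enumerate(x['a']):
    if abs(val - prior_val) >= threshold:
        # Assign through x.iat directly; chained x['output'].iat[n] = 1 may write to a temporary copy.
        x.iat[n, out_col] = 1
        prior_val = val  # Reset to new value found that exceeds threshold.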
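
The same loop can also be wrapped in a small function that works on any sequence of numbers, which makes the logic easier to check separately from the DataFrame. This is only a minimal sketch: the name `mark_bins` is a hypothetical helper and not part of pandas, and it assumes the first value always starts a bin, as in the loop above.

```python
def mark_bins(values, threshold):
    """Return a list of 0/1 flags, where 1 marks the start of a new bin.

    mark_bins is a hypothetical helper name, not a pandas function.
    """
    values = list(values)
    flags = [0] * len(values)
    if not values:
        return flags
    flags[0] = 1  # Assumption: the first value always starts a bin.
    anchor = values[0]  # Most recently marked value.
    for i in range(1, len(values)):
        if abs(values[i] - anchor) >= threshold:
            flags[i] = 1
            anchor = values[i]  # Later values are compared against this one.
    return flags


a = [1.1341, 1.13421, 1.13433, 1.13412, 1.13435, 1.13447, 1.13459, 1.13452,
     1.13471, 1.1348, 1.13496, 1.13474, 1.13483, 1.1349, 1.13502, 1.13515,
     1.13526, 1.13512]
flags = mark_bins(a, 0.0005)
print(flags)
```

On the sample data this marks positions 0, 8 and 16, which matches the output requested in the question. With the DataFrame from the question, the result can be attached in one step with `x['output'] = mark_bins(x['a'], 0.0005)`.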
