Pandas linear interpolation always fills with the same value

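Problem description

I'm running a linear interpolation on a pandas Series, but it keeps replacing the NaN values with the same value, and I can't figure out why.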

# Check where the missing values are located:
d1[d1.isna().any(axis=1)]

                     price  volume
datetime
2018-05-23 09:00:00   NaN   0.0 
2018-05-22 11:00:00   NaN   0.0 
2018-05-21 12:00:00   NaN   0.0 
2018-05-21 10:00:00   NaN   0.0 
2018-05-21 09:00:00   NaN   0.0 
2018-05-18 09:00:00   NaN   0.0 

d1["price"].astype(float).interpolate(method="time")

datetime
2018-05-23 10:00:00    23.825000
2018-05-23 09:00:00    22.425000
2018-05-22 17:00:00    24.041000
                         ...
2018-05-22 12:00:00    23.975000
2018-05-22 11:00:00    22.425000
2018-05-22 10:00:00    24.000000
                         ...
2018-05-21 12:00:00    22.425000
2018-05-21 11:00:00    23.200000
2018-05-21 10:00:00    22.425000
2018-05-21 09:00:00    22.425000
                         ...
2018-05-18 10:00:00    23.425000
2018-05-18 09:00:00    22.425000
2018-05-17 17:00:00    23.516000

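The missing values are always replaced with 22.425, which is neither the mean nor the median of the series. Can anyone help? Thanks!

Edit: a glimpse of the initial DataFrame: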

                     price    volume
datetime 
2018-05-23 16:00:00  23.936667 70.0 
2018-05-23 15:00:00  24.040000 5.0 
2018-05-23 14:00:00  23.971875 185.0 
2018-05-23 13:00:00  23.811111 250.0 
2018-05-23 12:00:00  23.800000 240.0 
2018-05-23 11:00:00  23.816667 140.0 
2018-05-23 10:00:00  23.825000 10.0 
2018-05-23 09:00:00  NaN 0.0 
2018-05-22 17:00:00  24.041000 260.0 
2018-05-22 16:00:00  24.062857 150.0 
2018-05-22 15:00:00  24.031818 1525.0 
2018-05-22 14:00:00  24.079167 165.0 
2018-05-22 13:00:00  23.950000 5.0 
2018-05-22 12:00:00  23.975000 375.0 
2018-05-22 11:00:00  NaN 0.0 
2018-05-22 10:00:00  24.000000 30.0 
2018-05-22 09:00:00  24.000000 30.0 

Tags: python, pandas, time-series, linear-interpolation

Solution
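A likely cause, judging only from the output in the question: the DatetimeIndex is sorted in descending order (newest row first). As far as I know, the "time" method interpolates on the index values via NumPy's np.interp, and np.interp documents that its results are meaningless when the x-coordinates are not increasing. That would explain every gap being filled with the same out-of-place value. Whether your pandas version already sorts internally is something I have not verified, so sorting explicitly is the safe option. A minimal sketch, assuming a frame named d1 rebuilt here from a few rows of the question (the exact construction is illustrative only):

```python
import pandas as pd

nan = float("nan")

# A few rows copied from the question, newest first (descending index).
d1 = pd.DataFrame(
    {
        "price": [23.825, nan, 24.041, 23.975, nan, 24.0],
        "volume": [10.0, 0.0, 260.0, 375.0, 0.0, 30.0],
    },
    index=pd.to_datetime(
        [
            "2018-05-23 10:00:00",
            "2018-05-23 09:00:00",
            "2018-05-22 17:00:00",
            "2018-05-22 12:00:00",
            "2018-05-22 11:00:00",
            "2018-05-22 10:00:00",
        ]
    ),
)
d1.index.name = "datetime"

# The index is not increasing, which time-based interpolation relies on.
print(d1.index.is_monotonic_increasing)

filled = (
    d1["price"]
    .astype(float)
    .sort_index()                  # oldest first, so time increases
    .interpolate(method="time")    # weight by the actual time gaps
    .sort_index(ascending=False)   # restore the original newest-first order
)
print(filled)
```

With the index ascending, each NaN is filled from its two neighbours in time, weighted by the time gaps, so the 11:00 gap on 2018-05-22 lands halfway between the 10:00 and 12:00 prices instead of at a repeated constant. If you don't need the newest-first order afterwards, drop the final sort_index call.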
