Interpolate proportionally with duplicate index

Problem description

I have a table like df = pd.DataFrame([1,np.nan,3,1,np.nan,3,50,np.nan,52], index=[7, 8, 9, 7, 12, 27, 7, 8, 9]):

index  values
7      1
8      NaN
9      3
7      1
12     NaN
27     3
7      50
8      NaN
9      52

The rows are correctly sorted. However, the index is not ordered and has duplicates by design.

How can I interpolate the values proportionally to the index (method="index")?

If I interpolate using the index, the resulting Series is wrong because of the duplicate index labels. Result of df.interpolate(method='index'):

index  values  desired  actual
7      1       1        1
8      NaN     2        2
9      3       3        3
7      1       1        1
12     NaN     1.5      52   <-- wat
27     3       3        3
7      50      50       50
8      NaN     51       1.1  <-- wat
9      52      52       52
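
As a quick check of the "desired" column, here is a minimal sketch that assumes plain linear interpolation between the two known neighbours inside one block. The variable names are illustrative and not from the original post. For the row with index 12, sitting between index 7 (value 1) and index 27 (value 3), both the hand formula and np.interp should give the 1.5 listed in the table.

```python
import numpy as np

# Middle row of the second block: index 12 lies between 7 (value 1) and 27 (value 3).
x0, x1 = 7, 27
y0, y1 = 1.0, 3.0
x = 12

# Linear interpolation proportional to the distance travelled along the index.
by_hand = y0 + (x - x0) / (x1 - x0) * (y1 - y0)

# np.interp performs the same piecewise-linear interpolation.
by_numpy = float(np.interp(x, [x0, x1], [y0, y1]))

print(by_hand, by_numpy)
```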

If not reproducible: Pandas 0.23.3, Numpy: 1.14.5, Python: 3.6.5

Tags: pandas, numpy

Solution


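Try grouping the DataFrame on its index, starting a new group wherever the index decreases, and interpolating within each group: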

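# Start a new group wherever the index decreases, then interpolate within each group.
# group_keys=False keeps the original index on newer pandas versions.
df.groupby(df.index.to_series().diff().lt(0).cumsum(), group_keys=False)\
  .apply(lambda x: x.interpolate(method='index'))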

Output:

       0
7    1.0
8    2.0
9    3.0
7    1.0
12   1.5
27   3.0
7   50.0
8   51.0
9   52.0
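
The same idea can be written as a self-contained script. This is a hedged sketch rather than the answer's exact code: the helper name interpolate_by_index_runs and the NumPy-based run labelling are assumptions made for illustration, and it presumes a reasonably recent pandas and NumPy (np.diff with prepend, Index.to_numpy). It labels each run of non-decreasing index values with an integer and interpolates inside each run.

```python
import numpy as np
import pandas as pd


def interpolate_by_index_runs(df):
    """Interpolate with method='index' inside each run of non-decreasing index.

    Hypothetical helper, not part of pandas.
    """
    idx = df.index.to_numpy()
    # A new run starts wherever the index drops below the previous label.
    run_id = np.cumsum(np.diff(idx, prepend=idx[0]) < 0)
    # group_keys=False keeps the original (duplicated) index on the result.
    return df.groupby(run_id, group_keys=False).apply(
        lambda g: g.interpolate(method="index")
    )


df = pd.DataFrame(
    [1, np.nan, 3, 1, np.nan, 3, 50, np.nan, 52],
    index=[7, 8, 9, 7, 12, 27, 7, 8, 9],
)
result = interpolate_by_index_runs(df)
print(result)
```

For the sample data the run labels are 0, 0, 0, 1, 1, 1, 2, 2, 2, so each block of three rows is interpolated on its own and the values match the output listed above.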
