Weekly resampling on Mondays

Problem description

I want to sum two columns per id and per week, with each week starting on Monday and aggregated independently.

df = pd.DataFrame({'id': ['x2', 'x2', 'x1', 'x1', 'x1'],
                   'date': ['2021-01-03', '2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
                   'distance_europe': [100, 100, 200, 200, 100],
                   'distance_USA': [0, 200, 100, 100, 3]})

Expected output

2020-12-28  x2  100    0
2020-12-28  x1  500  203
2021-01-04  x2  100  200

Update: my solution below works fine.

df = df.groupby('id').resample('W', on='date').sum().reset_index()
df['due_date'] = df.date - pd.Timedelta('6D')

Tags: python, pandas, pandas-resample

Solution

Use Grouper with closed='left' and label='left', together with id, and aggregate with sum:

import pandas as pd

df = pd.DataFrame({'id': ['x2', 'x2', 'x1', 'x1', 'x1'],
                   'date': ['2021-01-03', '2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
                   'distance_europe': [100, 100, 200, 200, 100],
                   'distance_USA': [0, 200, 100, 100, 3]})


df['date'] = pd.to_datetime(df['date'])

# Weekly bins that start on Monday: [Monday, next Monday), labelled by the starting Monday
df = (df.groupby([pd.Grouper(key='date', freq='W-MON', closed='left', label='left'), 'id'])
        .sum()
        .reset_index())
print(df)
        date  id  distance_europe  distance_USA
0 2020-12-28  x1              500           203
1 2020-12-28  x2              100             0
2 2021-01-04  x2              100           200
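
As a sanity check on the labels above, the Monday that starts each date's week can also be computed by hand. The sketch below is only an illustration, not part of the original answer: the names `dates` and `week_start` are made up here, and it relies on `Series.dt.weekday` returning 0 for Monday through 6 for Sunday.

```python
import pandas as pd

# The distinct dates from the question (hypothetical variable names)
dates = pd.to_datetime(pd.Series(['2021-01-03', '2021-01-09', '2021-01-02', '2021-01-01']))

# weekday is 0 for Monday ... 6 for Sunday, so subtracting it steps back to that week's Monday
week_start = dates - pd.to_timedelta(dates.dt.weekday, unit='D')

print(pd.DataFrame({'date': dates, 'week_start': week_start}))
```

Sunday 2021-01-03 maps back to Monday 2020-12-28, which is why it ends up in the first bin rather than the second.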

Your solution can be adapted by passing the same parameters as above, which also moves the weekly anchor from Sundays (the default for W) to Mondays:

df = (df.groupby('id')
        .resample('W-MON', on='date', closed='left', label='left')
        .sum()
        .reset_index())

print(df)
   id       date  distance_europe  distance_USA
0  x1 2020-12-28              500           203
1  x2 2020-12-28              100             0
2  x2 2021-01-04              100           200
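
The approach from the question's update (default 'W' bins, then subtracting six days) gives the same labels for date-only data, because the default weekly bins end on Sunday and are labelled with that Sunday. The following is a minimal sketch of that equivalence, assuming the same sample data; the variable names `sunday` and `monday` are illustrative, and it uses `pd.Grouper` for both variants so the two results have the same shape.

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['x2', 'x2', 'x1', 'x1', 'x1'],
    'date': pd.to_datetime(['2021-01-03', '2021-01-09', '2021-01-02',
                            '2021-01-01', '2021-01-01']),
    'distance_europe': [100, 100, 200, 200, 100],
    'distance_USA': [0, 200, 100, 100, 3],
})

# Default weekly bins: (Sunday, next Sunday], labelled by the closing Sunday
sunday = (df.groupby([pd.Grouper(key='date', freq='W'), 'id'])
            .sum()
            .reset_index())
# Shift the Sunday label back six days to the Monday that opens the same week
sunday['date'] = sunday['date'] - pd.Timedelta('6D')

# Monday-anchored bins: [Monday, next Monday), labelled by the opening Monday
monday = (df.groupby([pd.Grouper(key='date', freq='W-MON', closed='left', label='left'), 'id'])
            .sum()
            .reset_index())

print(sunday)
print(monday)
```

The two variants can differ once the timestamps carry a time of day: with the default right-closed bins, a value later than midnight on a Sunday falls into the following week's bin.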
