pandas: get the top n rows of each group

Problem description

I'm trying to use groupby with pandas, but I'm fairly new to Python and can't seem to find a solution.

import pandas as pd

raw_data = {'Products': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
        'Month': ['201903', '201903', '201902', '201901', '201902', '201901', '201902', '201904','201903', '201902', '201904', '201903'],
        'Sales': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3]}
df = pd.DataFrame(raw_data, columns = ['Products', 'Month', 'Sales'])
df

The data looks like this:

Products    Month   Sales
0   A           201903  4
1   A           201903  24
2   A           201902  31
3   A           201901  2
4   B           201902  3
5   B           201901  4
6   B           201902  24
7   C           201904  31
8   C           201903  2
9   C           201902  3
10  C           201904  2
11  C           201903  3

For each product, I need the total sales for each of its two most recent months, like this:

Products    Month   Sales
A           201902  31
A           201903  28
B           201901  4
B           201902  27
C           201903  5
C           201904  33

Apologies if the formatting is off; I'm still new to SO.

Thanks

Tags: python, pandas, group-by

Solution


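This will do it for the sample data. It sums Sales per product and month, sorts each product's rows by Sales in descending order, and keeps the first two rows of each group. Because it ranks by Sales rather than by Month, it matches the expected output only when the two most recent months are also the two best-selling ones, as they are here: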

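# Sum Sales per (Products, Month), then rank each product's months by Sales
(df.groupby(['Products', 'Month'], as_index=False)
   .sum()
   .sort_values(['Products', 'Sales'],
                ascending=[True, False])  # Products A-Z, Sales high to low
   .groupby('Products')
   .head(2))  # first two rows of each product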

  Products   Month  Sales
1        A  201902     31
2        A  201903     28
4        B  201902     27
3        B  201901      4
7        C  201904     33
6        C  201903      5
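
Since the question asks for the two most recent months rather than the two best-selling ones, a variant that orders by Month may be closer to the intent. The following is a minimal sketch, not the original answer's method. It assumes Month is always a zero-padded 'YYYYMM' string, so that sorting the strings also sorts them chronologically; the names `monthly` and `latest_two` are introduced here for illustration only.

```python
import pandas as pd

raw_data = {
    'Products': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
    'Month': ['201903', '201903', '201902', '201901', '201902', '201901',
              '201902', '201904', '201903', '201902', '201904', '201903'],
    'Sales': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
}
df = pd.DataFrame(raw_data, columns=['Products', 'Month', 'Sales'])

# Total sales for every (Products, Month) pair.
monthly = df.groupby(['Products', 'Month'], as_index=False)['Sales'].sum()

# Assumption: 'YYYYMM' strings sort chronologically, so after sorting by
# Month the last two rows of each product are its two latest months.
latest_two = (monthly.sort_values(['Products', 'Month'])
                     .groupby('Products')
                     .tail(2)
                     .reset_index(drop=True))
print(latest_two)
```

On the sample data this yields the same six rows as the expected output in the question, ordered by product and then by month. To get a different number of months per product, change the argument passed to `tail`.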
