Using pivot / aggregation in a DataFrame

Problem description

I have a DataFrame with the columns name, value, and date, where date is the day of the month (dd format):

name  value  date
mark  200    1
john  300    1
mark  200    2
mark  200    2
mark  300    2
john  300    3
john  400    2

Using pivot and aggregation in pandas, I need to convert it into this, with one row per date and name, the number of rows in that group, and the sum of value:

date  name   count(date)  value
1     mark       1         200
2     mark       3         700
1     john       1         300
2     john       2         300
3     john       1         400

Tags: python, pandas, dataframe, group-by, pivot

Solution


Use GroupBy.agg with a list of tuples, where each tuple pairs a new column name with an aggregation function:

df1 = (df.groupby(['date','name'])['value']
         .agg([('count', 'size'), ('value','sum')])
         .reset_index())
print(df1)
   date  name  count  value
0     1  john      1    300
1     1  mark      1    200
2     2  john      1    400
3     2  mark      3    700
4     3  john      1    300
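
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's sample rows are typed in by hand as a dictionary (the question does not show how df was built), and it otherwise applies the same list-of-tuples aggregation as above.

```python
import pandas as pd

# Sample data copied by hand from the question (assumed construction)
df = pd.DataFrame({
    "name": ["mark", "john", "mark", "mark", "mark", "john", "john"],
    "value": [200, 300, 200, 200, 300, 300, 400],
    "date": [1, 1, 2, 2, 2, 3, 2],
})

# Each tuple is (output column name, aggregation function):
# 'size' counts the rows in each group, 'sum' adds up the values
df1 = (df.groupby(["date", "name"])["value"]
         .agg([("count", "size"), ("value", "sum")])
         .reset_index())
print(df1)
```

Because groupby sorts by the group keys by default, the rows come out ordered by date and then by name.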

Another solution, using named aggregation in pandas 0.25+:

df1 = (df.groupby(['date','name'])
         .agg(count=('date', 'size'), value=('value', 'sum'))
         .reset_index())
print(df1)
   date  name  count  value
0     1  john      1    300
1     1  mark      1    200
2     2  john      1    400
3     2  mark      3    700
4     3  john      1    300
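
If the rows should follow the layout in the question (mark first, then john, each ordered by date), one option is to sort after aggregating. The sketch below is only an illustration under two assumptions: the sample data is rebuilt by hand, and 'size' is taken over the value column rather than date, which gives the same row counts because size ignores which column is selected.

```python
import pandas as pd

# Sample data copied by hand from the question (assumed construction)
df = pd.DataFrame({
    "name": ["mark", "john", "mark", "mark", "mark", "john", "john"],
    "value": [200, 300, 200, 200, 300, 300, 400],
    "date": [1, 1, 2, 2, 2, 3, 2],
})

# Named aggregation: keyword = (source column, aggregation function)
out = (df.groupby(["date", "name"])
         .agg(count=("value", "size"), value=("value", "sum"))
         .reset_index()
         # name descending puts mark before john; date ascending within each name
         .sort_values(["name", "date"], ascending=[False, True])
         .reset_index(drop=True))
print(out)
```

The final reset_index(drop=True) only renumbers the rows after sorting; drop it if the original positions matter.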
