Pandas pivot on one column with a multi-level index

Problem description

I want to pivot a table on one column, using two columns as the index.

Dataset:

uid     interaction date
1       like        2016-12-04
1       like        2016-12-05
1       comment     2016-12-05
1       like        2016-12-05
2       like        2016-12-04
2       like        2016-12-05
2       comment     2016-12-05
2       like        2016-12-05

For each uid and date, I want the number of interactions of each type that occurred for that uid on that date.

Expected result:

uid     like    comment  date
1       1       0       2016-12-04
1       2       1       2016-12-05
2       1       0       2016-12-04
2       2       1       2016-12-05      

What I tried:

doc_social_interaction.pivot_table(index = ['uid','date'],columns = 'interaction', aggfunc=sum)

Tags: python, pandas

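Solution

You are close. There is no value column to sum here, so count the rows in each group instead by passing aggfunc='size' (GroupBy.size):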

df1 = df.pivot_table(index=['uid','date'],columns='interaction',aggfunc='size',fill_value=0)
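To try this end to end, here is a minimal sketch that rebuilds the question's dataset by hand. The DataFrame construction is an assumption (the question does not show how the data was loaded), and the dates are kept as plain strings.

```python
import pandas as pd

# Sample data copied from the question; building it by hand is an assumption.
df = pd.DataFrame({
    "uid": [1, 1, 1, 1, 2, 2, 2, 2],
    "interaction": ["like", "like", "comment", "like",
                    "like", "like", "comment", "like"],
    "date": ["2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05",
             "2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05"],
})

# Count rows per (uid, date, interaction); combinations that never occur
# are filled with 0 instead of NaN.
df1 = df.pivot_table(index=["uid", "date"], columns="interaction",
                     aggfunc="size", fill_value=0)
print(df1)
```

Because uid, date and interaction are all used as keys, no numeric column is left to aggregate, which is why a row count is the fitting aggregation here rather than a sum.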

Alternative solutions:

df1 = df.groupby(['uid','date','interaction']).size().unstack(fill_value=0)

df1 = df.groupby(['uid','date'])['interaction'].value_counts().unstack(fill_value=0)

df1 = pd.crosstab([df['uid'],df['date']], df['interaction'])

print (df1)
interaction     comment  like
uid date                     
1   2016-12-04        0     1
    2016-12-05        1     2
2   2016-12-04        0     1
    2016-12-05        1     2
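All four variants are expected to produce the same counts. A quick way to check this on the sample data is sketched below. The hand-built DataFrame and the helper name `normalized` are assumptions made for illustration. Both axes are sorted before comparing so that any difference in row or column order between the methods does not matter.

```python
import pandas as pd

# Sample data copied from the question; building it by hand is an assumption.
df = pd.DataFrame({
    "uid": [1, 1, 1, 1, 2, 2, 2, 2],
    "interaction": ["like", "like", "comment", "like",
                    "like", "like", "comment", "like"],
    "date": ["2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05",
             "2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05"],
})

variants = {
    "pivot_table": df.pivot_table(index=["uid", "date"], columns="interaction",
                                  aggfunc="size", fill_value=0),
    "groupby_size": df.groupby(["uid", "date", "interaction"])
                      .size().unstack(fill_value=0),
    "value_counts": df.groupby(["uid", "date"])["interaction"]
                      .value_counts().unstack(fill_value=0),
    "crosstab": pd.crosstab([df["uid"], df["date"]], df["interaction"]),
}

def normalized(table):
    # Hypothetical helper: sort rows and columns, then return plain lists.
    return table.sort_index().sort_index(axis=1).to_numpy().tolist()

reference = normalized(variants["pivot_table"])
same = {name: normalized(table) == reference for name, table in variants.items()}
print(same)
```

Which variant to use is mostly a matter of taste: `pd.crosstab` is the shortest, while the `groupby` forms make the counting step explicit.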

Finally, some data cleaning:

df1 = df1.reset_index().rename_axis(None, axis=1)
print (df1)
   uid        date  comment  like
0    1  2016-12-04        0     1
1    1  2016-12-05        1     2
2    2  2016-12-04        0     1
3    2  2016-12-05        1     2
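The cleaned table has the columns in the order uid, date, comment, like, whereas the question asked for uid, like, comment, date. A sketch of the whole pipeline with a final column reorder follows. The hand-built DataFrame and the variable name `result` are assumptions. `axis=1` is passed by keyword because current pandas releases expect it that way.

```python
import pandas as pd

# Sample data copied from the question; building it by hand is an assumption.
df = pd.DataFrame({
    "uid": [1, 1, 1, 1, 2, 2, 2, 2],
    "interaction": ["like", "like", "comment", "like",
                    "like", "like", "comment", "like"],
    "date": ["2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05",
             "2016-12-04", "2016-12-05", "2016-12-05", "2016-12-05"],
})

result = (
    df.pivot_table(index=["uid", "date"], columns="interaction",
                   aggfunc="size", fill_value=0)
      .reset_index()               # turn uid and date back into columns
      .rename_axis(None, axis=1)   # drop the leftover "interaction" label
)

# Reorder the columns to match the layout requested in the question.
result = result[["uid", "like", "comment", "date"]]
print(result)
```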
