python - Python code to group purchases by genre

Problem description

I have the following table (df) with these attributes:

id      purchase date    genre (music, drama, horror, ...)
23      1/1/2020         music
23      1/2/2020         horror
24      5/2/2020         drama

I want to generate a table with:

id      count of purchases    no. of purchases in music    no. of purchases in horror    no. of purchases in drama
23      2                     1                            1                             0
24      1                     0                            0                             1

I tried the following code:

a = df.groupby('id').agg({'purchasedate': lambda a: a.count()})
But how do I get the new per-genre columns?

Tags: python, python-3.x

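Does it really have to be one count column per genre? What if more genres are added later? That would change the number of columns in your result table, and depending on how you build it, you might even have to change your code.

Instead, you can do the following, which gives you one row for each id and genre combination that occurs, along with the number of purchases for that combination: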

df.groupby(['id', "genre"]).size().reset_index(name="purchases")

   id   genre  purchases
0  23  horror          1
1  23   music          1
2  24   drama          1

For the totals, use a second query:

df.groupby('id').size().reset_index(name="total_purchases")

   id  total_purchases
0  23                2
1  24                1
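To try the two queries together, here is a minimal self-contained sketch. It rebuilds the sample rows from the question; the column name "purchase_date" is an assumption (the question writes it both as "purchase date" and "purchasedate"), and the final merge step is an optional addition for convenience, not part of the queries above.

```python
import pandas as pd

# Sample rows from the question; "purchase_date" is an assumed column name.
df = pd.DataFrame({
    "id": [23, 23, 24],
    "purchase_date": ["1/1/2020", "1/2/2020", "5/2/2020"],
    "genre": ["music", "horror", "drama"],
})

# One row per (id, genre) combination that actually occurs.
per_genre = df.groupby(["id", "genre"]).size().reset_index(name="purchases")

# One row per id with the overall number of purchases.
totals = df.groupby("id").size().reset_index(name="total_purchases")

# Optional: attach each id's total to its per-genre rows.
combined = per_genre.merge(totals, on="id", how="left")
print(combined)
```

Keeping the result in this long form means a new genre only adds rows, so downstream code that reads the "genre" and "purchases" columns does not need to change.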

The structure is a bit different from what you asked for, but it adapts well to a growing list of genres in your database.
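If the wide layout from the question is still needed, the same grouped counts can be pivoted so that each genre becomes a column. The following is only a sketch under the same assumptions as the sample data in the question (the "purchase_date" column name is assumed); the set of columns it produces depends on which genres appear in the data.

```python
import pandas as pd

# Sample rows from the question; "purchase_date" is an assumed column name.
df = pd.DataFrame({
    "id": [23, 23, 24],
    "purchase_date": ["1/1/2020", "1/2/2020", "5/2/2020"],
    "genre": ["music", "horror", "drama"],
})

# Count rows per (id, genre), then move the genres into columns.
# Combinations that never occur are filled with 0.
wide = df.groupby(["id", "genre"]).size().unstack(fill_value=0)

# Total purchases per id is the row sum over all genre columns.
wide.insert(0, "total_purchases", wide.sum(axis=1))

# Turn the id index back into a regular column and drop the "genre" axis label.
wide = wide.reset_index()
wide.columns.name = None
print(wide)
```

As the answer points out, this table gains a column whenever a new genre shows up, so code that consumes it should not rely on a fixed column list.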
