python - Cumulative (summary) rows in a pandas df
Question
I have a huge pandas df that looks like this:
Country category brand quarter device countA CountB percentageA/B
XXX A1 A2 Q2 PC 12 12 100
XXX A1 A2 Q2 Tablet 2 4 50
YYY A4 A5 Q4 PC 50 50 100
YYY A4 A5 Q4 Tablet 10 10 100
I need to add a row to the data that is the sum of the 2 data points above it:
Country category brand quarter device countA CountB percentage(A/B)
XXX A1 A2 Q2 PC 12 12 100 %
XXX A1 A2 Q2 Tablet 2 4 50 %
**XXX A1 A2 Q2 PC + Tablet 14 16 87.5%**
YYY A4 A5 Q4 PC 50 50 100
YYY A4 A5 Q4 Tablet 10 12 83%
**YYY A4 A5 Q4 PC+Tablet 60 62 96.7%**
Please find the structure of d below. Ideally, only a few brands in a category have just one device.
Country category brand quarter device
XXX A1 A2 Q2 Tablet +PC
A4 A5 Q2 Tablet+PC
A9 A10 Q2 PC
A11 Q1 PC
print(type(d))
Solution
Use groupby, merge, and concat. Note that you still haven't said how percentageA/B should be calculated.
# groupby and apply with join to get the combined device label per group
d = df.groupby(['Country','category','brand','quarter'])['device'].apply('+'.join)
# groupby with sum (numeric columns only), then merge in the device labels and reset the index
new = df.groupby(['Country','category','brand','quarter']).sum(numeric_only=True).merge(d, left_index=True, right_index=True).reset_index()
# concat original df with the new summary rows
pd.concat([df,new], sort=False)
Country category brand quarter device countA CountB percentageA/B
0 XXX A1 A2 Q2 PC 12 12 100
1 XXX A1 A2 Q2 Tablet 2 4 50
2 YYY A4 A5 Q4 PC 50 50 100
3 YYY A4 A5 Q4 Tablet 10 10 100
0 XXX A1 A2 Q2 PC+Tablet 14 16 150
1 YYY A4 A5 Q4 PC+Tablet 60 60 200
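To reproduce the output above, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question's input table; the names `keys`, `devices`, `totals`, and `combined` are illustrative and not from the original answer, and `numeric_only=True` is an assumption added so that the string `device` column is left out of the sum on recent pandas versions.

```python
import pandas as pd

# Sample data copied from the input table in the question
df = pd.DataFrame({
    "Country": ["XXX", "XXX", "YYY", "YYY"],
    "category": ["A1", "A1", "A4", "A4"],
    "brand": ["A2", "A2", "A5", "A5"],
    "quarter": ["Q2", "Q2", "Q4", "Q4"],
    "device": ["PC", "Tablet", "PC", "Tablet"],
    "countA": [12, 2, 50, 10],
    "CountB": [12, 4, 50, 10],
    "percentageA/B": [100, 50, 100, 100],
})

keys = ["Country", "category", "brand", "quarter"]  # illustrative name

# Join the device names within each group, e.g. "PC+Tablet"
devices = df.groupby(keys)["device"].apply("+".join)

# Sum the numeric columns per group, then attach the joined device labels
totals = (
    df.groupby(keys)
      .sum(numeric_only=True)
      .merge(devices, left_index=True, right_index=True)
      .reset_index()
)

# Append the summary rows below the original rows
combined = pd.concat([df, totals], sort=False)
print(combined)
```

Note that `percentageA/B` in the summary rows is simply the sum of the per-device percentages (100 + 50 = 150), which is why it has to be recomputed from the summed counts.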
Alternatively, you can try the following, which also recomputes percentageA/B from the summed counts:
# groupby and apply with join to get the combined device label per group
d = df.groupby(['Country','category','brand','quarter'])['device'].apply('+'.join).to_frame().reset_index()
# groupby with sum (numeric columns only), then merge in the device labels on the group keys
new = df.groupby(['Country','category','brand','quarter'], as_index=False).sum(numeric_only=True).merge(d, on=['Country','category','brand','quarter'])
# concat original df with the new summary rows
final_df = pd.concat([df,new], sort=False)
# recompute the percentage from the counts so the summary rows are correct
final_df['percentageA/B'] = final_df['countA'] / final_df['CountB'] * 100
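The question also mentions brands that have only one device in a category. For those, a summary row would just duplicate the single existing row. The sketch below is one possible way to handle that, assuming summary rows are wanted only for groups with more than one device. The `ZZZ` row is made-up sample data, and `keys`, `multi`, `devices`, and `totals` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical single-device group (ZZZ)
df = pd.DataFrame({
    "Country": ["XXX", "XXX", "YYY", "YYY", "ZZZ"],
    "category": ["A1", "A1", "A4", "A4", "A9"],
    "brand": ["A2", "A2", "A5", "A5", "A10"],
    "quarter": ["Q2", "Q2", "Q4", "Q4", "Q2"],
    "device": ["PC", "Tablet", "PC", "Tablet", "PC"],
    "countA": [12, 2, 50, 10, 5],
    "CountB": [12, 4, 50, 10, 8],
})

keys = ["Country", "category", "brand", "quarter"]

# Keep only rows that belong to groups with more than one device row
multi = df[df.groupby(keys)["device"].transform("size") > 1]

# Joined device label per group, as a frame keyed by the group columns
devices = multi.groupby(keys)["device"].apply("+".join).reset_index()

# Sum only the count columns, then merge in the device labels
totals = (
    multi.groupby(keys, as_index=False)[["countA", "CountB"]]
         .sum()
         .merge(devices, on=keys)
)

# Append the summary rows and compute the percentage for every row
final_df = pd.concat([df, totals], sort=False, ignore_index=True)
final_df["percentageA/B"] = final_df["countA"] / final_df["CountB"] * 100
print(final_df)
```

With this data, the XXX summary row comes out as 14 / 16 * 100 = 87.5, matching the value asked for in the question, and no summary row is produced for the single-device ZZZ group.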