python-3.x - Pandas - 与两列合并时的数据透视表
问题描述
我有一个包含两个分类列和几个数字列的表。我希望在将两个分类列合并到数字列的名称中时,在某个索引上旋转表(在示例中索引是客户端)。有没有办法在没有循环的情况下做到这一点?
这是一个示例表:
客户 | 项目 | 案子 | 开支 | value_mean | value_std |
---|---|---|---|---|---|
A先生 | 项目_15 | 4 | 100122 | 9655.566667 | 12.70499327 |
A先生 | 项目_15 | 2 | 4015 | 9653 | 17.66352173 |
A先生 | 项目_15 | 7 | 8953 | 94.4 | 2.065591118 |
A先生 | 项目_16 | 1 | 3922 | 65519.4 | 0.894427191 |
A先生 | 项目_16 | 6 | 8953 | 21093.5 | 17816.03006 |
A先生 | 项目_16 | 7 | 8953 | 30665.3 | 30643.27374 |
A先生 | 项目_16 | 2 | 4015 | 65517.8 | 1.788854382 |
A先生 | 项目_16 | 4 | 100122 | 65518.86667 | 1.153402392 |
A先生 | 项目_16 | 5 | 3109 | 65519 | 1.632993162 |
A先生 | 项目_18 | 4 | 100122 | 78.84444444 | 16.89884719 |
A先生 | 项目_18 | 5 | 3109 | 5820 | 6735.594059 |
B先生 | 项目_15 | 7 | 9063 | 94.6 | 1.646545205 |
B先生 | 项目_15 | 2 | 4015 | 9636 | 14.38749457 |
B先生 | 项目_15 | 6 | 8968 | 93.6 | 1.264911064 |
B先生 | 项目_16 | 5 | 4016 | 65519 | 1.414213562 |
B先生 | 项目_16 | 6 | 8968 | 22375.3 | 16701.95844 |
B先生 | 项目_16 | 7 | 9063 | 36482.5 | 31091.74401 |
B先生 | 项目_16 | 4 | 98966 | 65518.78 | 1.133333333 |
B先生 | 项目_18 | 1 | 2906 | 79.5 | 1.914854216 |
B先生 | 项目_18 | 5 | 4016 | 6257 | 6399.977109 |
B先生 | 项目_18 | 6 | 8968 | 13304.3 | 52.38330947 |
B先生 | 项目_18 | 2 | 4015 | 78.8 | 1.095445115 |
我想要这样的东西:
客户 | Project_15_Case_4_Spending | Project_15_Case_4_value_mean | Project_15_Case_4_value_std | Project_15_Case_2_Spending | Project_15_Case_2_value_mean | Project_15_Case_2_value_std | ... |
---|---|---|---|---|---|---|---|
A先生 | 100122 | 9655.566667 | 12.70499327 | 4015 | 9653 | 17.66352173 | ... |
B先生 | 9063 | 94.6 | 1.646545205 | 4015 | 9636 | 14.38749457 | ... |
谢谢你的帮助。
解决方案
使用.stack()
+ .unstack()
:然后您可以将列序列保持在原始逐行 3 列序列之后,如预期输出中的列序列所示:
df2 = df.set_index(['Client', 'Project', 'Case']).stack().unstack([1,2]).unstack()
df2.columns = df2.columns.map(lambda x: f'{x[0]}_{x[1]}_{x[2]}')
df2 = df2.reset_index()
结果:
print(df2)
Client Project_15_4_Spending Project_15_4_value_mean Project_15_4_value_std Project_15_2_Spending Project_15_2_value_mean Project_15_2_value_std Project_15_7_Spending Project_15_7_value_mean Project_15_7_value_std Project_16_1_Spending Project_16_1_value_mean Project_16_1_value_std Project_16_6_Spending Project_16_6_value_mean Project_16_6_value_std Project_16_7_Spending Project_16_7_value_mean Project_16_7_value_std Project_16_2_Spending Project_16_2_value_mean Project_16_2_value_std Project_16_4_Spending Project_16_4_value_mean Project_16_4_value_std Project_16_5_Spending Project_16_5_value_mean Project_16_5_value_std Project_18_4_Spending Project_18_4_value_mean Project_18_4_value_std Project_18_5_Spending Project_18_5_value_mean Project_18_5_value_std Project_15_6_Spending Project_15_6_value_mean Project_15_6_value_std Project_18_1_Spending Project_18_1_value_mean Project_18_1_value_std Project_18_6_Spending Project_18_6_value_mean Project_18_6_value_std Project_18_2_Spending Project_18_2_value_mean Project_18_2_value_std
0 MisterA 100122.0 9655.566667 12.704993 4015.0 9653.0 17.663522 8953.0 94.4 2.065591 3922.0 65519.4 0.894427 8953.0 21093.5 17816.03006 8953.0 30665.3 30643.27374 4015.0 65517.8 1.788854 100122.0 65518.86667 1.153402 3109.0 65519.0 1.632993 100122.0 78.844444 16.898847 3109.0 5820.0 6735.594059 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 MisterB NaN NaN NaN 4015.0 9636.0 14.387495 9063.0 94.6 1.646545 NaN NaN NaN 8968.0 22375.3 16701.95844 9063.0 36482.5 31091.74401 NaN NaN NaN 98966.0 65518.78000 1.133333 4016.0 65519.0 1.414214 NaN NaN NaN 4016.0 6257.0 6399.977109 8968.0 93.6 1.264911 2906.0 79.5 1.914854 8968.0 13304.3 52.383309 4015.0 78.8 1.095445
推荐阅读
- regex - grep 的贪婪行为
- android - AppCompatTextView 内的文本超出范围
- python - 在保存在 Django 模型中之前,有什么方法可以从 ManyToManyField 中获取选定的值
- python - Resample() 为不存在的日期返回不正确的数字
- ssl - Cert-Manager 为 AKS 提供自己的 SSL 证书
- php - 将 PHP 脚本转换为 Unix 可执行文件
- javascript - 在 ngx-editor 中,单击按钮时如何在光标位置插入 html 块
- arrays - 角度循环遍历对象数组
- javascript - 抛出错误:Describe 不是函数。运行 Mocha 测试时出现此错误
- java - Java 编译器中是否有一个全局设置来禁用 String.formatted() 的弃用?