pandas - concise way of flattening multiindex columns
Problem description
Using more than one function in a groupby-aggregate results in multi-index columns, which I then want to flatten.
Example:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 1, 1, 2, 2, 2, 3, 3, 3],
     'B': np.random.random(9),
     'C': np.random.random(9)}
)
out = df.groupby('A').agg({'B': [np.mean, np.std], 'C': np.median})
# example output
          B                   C
       mean       std    median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
Currently, I do it manually like this:
out.columns = ['B_mean', 'B_std', 'C_median']
which gives me the result I want:
     B_mean     B_std  C_median
A
1  0.791846  0.091657  0.394167
2  0.156290  0.202142  0.453871
3  0.482282  0.382391  0.892514
But I'm looking for a way to automate this, because renaming the columns by hand is monotonous, time-consuming, and prone to typos.
Is there a way to return a flattened index instead of a multi-index when doing a groupby-aggregate?
I need to flatten the columns to save to a text file, which will then be read by a different program that doesn't handle multi-indexed columns.
Solution
You can use map with join on the columns:
out.columns = out.columns.map('_'.join)
out
Out[23]:
     B_mean     B_std  C_median
A
1  0.204825  0.169408  0.926347
2  0.362184  0.404272  0.224119
3  0.533502  0.380614  0.218105
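Since the goal in the question is a text file with a flat header, here is a minimal end-to-end sketch of the same map/join approach. The seeded generator, the string aggregation names ('mean', 'std', 'median') and the in-memory buffer are my own choices to keep the run repeatable. They are not part of the original code.

```python
import io

import numpy as np
import pandas as pd

# Seeded generator (an assumption, used only to make the example repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'B': rng.random(9),
    'C': rng.random(9),
})

# Two functions on 'B' produce two-level (MultiIndex) columns.
out = df.groupby('A').agg({'B': ['mean', 'std'], 'C': 'median'})

# Each column label is a tuple such as ('B', 'mean'); join its parts with '_'.
out.columns = out.columns.map('_'.join)

# Write to an in-memory buffer instead of a real file and inspect the header.
buffer = io.StringIO()
out.to_csv(buffer)
header = buffer.getvalue().splitlines()[0]
print(header)  # A,B_mean,B_std,C_median
```

The header comes out as a single line, so a program that cannot handle multi-indexed columns should be able to read the file.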
When the column labels contain integers, '_'.join fails because it only accepts strings, so I like this way better:
out.columns.map('{0[0]}_{0[1]}'.format)
Out[27]: Index(['B_mean', 'B_std', 'C_median'], dtype='object')
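To illustrate the integer case, here is a small sketch with made-up column labels (the `wide` frame and its values are hypothetical, not from the question). It also shows a variant that converts every level with str, which may be handy when there are more than two levels. That variant is my own addition, not part of the original answer.

```python
import pandas as pd

# Hypothetical two-level columns whose second level is an integer.
columns = pd.MultiIndex.from_tuples([('B', 1), ('B', 2), ('C', 1)])
wide = pd.DataFrame([[0.1, 0.2, 0.3]], columns=columns)

# str.join only accepts strings, so a label like ('B', 1) raises TypeError.
try:
    '_'.join(('B', 1))
    join_failed = False
except TypeError:
    join_failed = True

# The format-based version from the answer indexes into each label tuple.
flat_format = wide.columns.map('{0[0]}_{0[1]}'.format)

# Variant (an assumption): stringify every level, for any number of levels.
flat_generic = wide.columns.map(lambda labels: '_'.join(map(str, labels)))

print(list(flat_format))   # ['B_1', 'B_2', 'C_1']
print(list(flat_generic))  # ['B_1', 'B_2', 'C_1']
```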