python-3.x - Grouping in a pandas DataFrame
Problem description
I have this input DataFrame:
Test1 Test2 Test3 Subject
0 45 NaN NaN Python
1 50 NaN NaN Python
2 NaN 30 NaN Python
3 NaN 35 NaN OS
4 NaN 38 NaN OS
5 NaN 43 NaN Java
6 NaN 32 NaN DS
7 NaN NaN 49 DS
8 NaN 25 NaN DS
9 NaN 34 NaN DS
The expected output is (a DataFrame):
Subject Test1 Test2 Test3
Python 45,50 30
OS 35,38
Java 43
DS 32,25,34 49
I tried this code:
df.groupby(['subject']).sum().reset_index().assign(subject =lambda x: x['subject'].where(~x['subject'].duplicated(), '')).to_csv('filename.csv', index=False)
It does not produce the desired output.
Solution
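Use a custom function that drops missing values with Series.dropna, converts the remaining values to integers if necessary, then converts them to strings and combines them with join: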
f = lambda x: ','.join(x.dropna().astype(int).astype(str))
df = df.groupby('Subject', sort=False).agg(f).reset_index()
print (df)
Subject Test1 Test2 Test3
0 Python 45,50 30
1 OS 35,38
2 Java 43
3 DS 32,25,34 49
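If the values come in many different formats (for example, some columns are numeric and some are strings), another idea is to skip the conversion to integers: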
f = lambda x: ','.join(x.dropna().astype(str))
df = df.groupby('Subject', sort=False).agg(f).reset_index()
print (df)
Subject Test1 Test2 Test3
0 Python 45.0,50.0 30.0
1 OS 35.0,38.0
2 Java 43.0
3 DS 32.0,25.0,34.0 49.0
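The mixed-format case can be sketched as follows. The small frame here is invented for illustration (the Score and Grade columns are hypothetical and do not come from the question); it only shows that skipping the integer cast lets a text column pass through the same function, at the cost of numeric values keeping their float formatting.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one numeric column and one text column.
mixed_df = pd.DataFrame({
    "Score": [45.0, 50.0, np.nan, 35.0],
    "Grade": ["B", np.nan, "C", "A"],
    "Subject": ["Python", "Python", "OS", "OS"],
})


def join_as_text(col):
    """Drop NaN and join the remaining values as text, without an int cast."""
    return ",".join(col.dropna().astype(str))


mixed = mixed_df.groupby("Subject", sort=False).agg(join_as_text).reset_index()
print(mixed)
```

Applying astype(int) to the Grade column would raise an error, which is why this variant leaves the values as they are.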