python - How can I "concat" rows by same value in a column in Pandas?
Problem description
I would like to combine rows that share the same value in one column into a single row, collecting the differing values of the other columns, and get the edited DataFrame back.
Input Data :
ID F_Name L_Name Address SSN Phone
123 Sam Doe 123 12345 111-111-1111
123 Sam Doe 123 12345 222-222-2222
123 Sam Doe abc345 12345 111-111-1111
123 Sam Doe abc345 12345 222-222-2222
456 Naveen Gupta 456 45678 333-333-3333
456 Manish Gupta 456 45678 333-333-3333
Expected Output Data :
myschema = {
"ID":"123"
"F_Name":"Sam"
"L_Name":"Doe"
"Address":"[123, abc345]"
"Phone":"[111-111-1111,222-222-2222]"
"SSN":"12345"
}
{
"ID":"456"
"F_Name":"[Naveen, Manish]"
"L_Name":"Gupta"
"Address":"456"
"Phone":"[333-333-3333]"
"SSN":"45678"
}
Code Tried :
import pandas as pd

df = pd.read_csv('data.csv')
print(df)
Solution
Try groupby() + agg():
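myschema = (df.groupby('ID', as_index=False)
    # one distinct value -> keep it as a scalar; several -> keep them as a list
    .agg(lambda x: list(set(x))[0] if len(set(x)) == 1 else list(set(x)))
    # the short form 'r' was removed in pandas 2.0; spell out 'records'
    .to_dict('records'))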
Or, if order is important, aggregate with pd.unique(), which returns values in order of first appearance:
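myschema = (df.groupby('ID', as_index=False)
    # one distinct value -> keep it as a scalar; several -> keep them as a list
    .agg(lambda x: pd.unique(x)[0] if len(pd.unique(x)) == 1 else pd.unique(x).tolist())
    # the short form 'r' was removed in pandas 2.0; spell out 'records'
    .to_dict('records'))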
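In the code above, we group the DataFrame by the ID column only, then aggregate every remaining column within each group. For each column, the aggregation finds the unique values (by building a set and casting it to a list in the first snippet, or with pd.unique() in the second). If there is exactly one unique value, the value at position 0 is selected so the result is a scalar; otherwise the whole list is kept. Finally, to_dict() converts the aggregated result into a list of dictionaries, one per ID. Note that a set does not preserve order, so the lists produced by the first snippet can come out in any order.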
Output of myschema:
[{'ID': 123,
'F_Name': 'Sam',
'L_Name': 'Doe',
'Address': ['abc345', '123'],
'SSN': 12345,
'Phone': ['222-222-2222', '111-111-1111']},
{'ID': 456,
'F_Name': ['Naveen', 'Manish'],
'L_Name': 'Gupta',
'Address': '456',
'SSN': 45678,
'Phone': '333-333-3333'}]
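For anyone who wants to try this without a data.csv file, here is a minimal end-to-end sketch of the order-preserving variant. It assumes the sample rows from the question are built inline, with Address kept as strings and ID and SSN as integers. The helper name collapse is hypothetical and only wraps the same pd.unique() logic in a named function.

```python
import pandas as pd

# Sample rows copied from the question. Building the frame inline instead of
# reading data.csv, and the column dtypes chosen here, are assumptions.
df = pd.DataFrame({
    "ID": [123, 123, 123, 123, 456, 456],
    "F_Name": ["Sam", "Sam", "Sam", "Sam", "Naveen", "Manish"],
    "L_Name": ["Doe", "Doe", "Doe", "Doe", "Gupta", "Gupta"],
    "Address": ["123", "123", "abc345", "abc345", "456", "456"],
    "SSN": [12345, 12345, 12345, 12345, 45678, 45678],
    "Phone": ["111-111-1111", "222-222-2222", "111-111-1111",
              "222-222-2222", "333-333-3333", "333-333-3333"],
})


def collapse(values):
    """Hypothetical helper: scalar if one distinct value, else a list."""
    # pd.unique keeps values in order of first appearance
    unique_values = pd.unique(values).tolist()
    return unique_values[0] if len(unique_values) == 1 else unique_values


# Group by ID, collapse every other column, and emit one dict per ID
myschema = df.groupby("ID", as_index=False).agg(collapse).to_dict("records")

for record in myschema:
    print(record)
```

Because pd.unique() is used rather than a set, the lists come out in the order the values first appear in the input, so repeated runs should give the same result.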