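python - Is there a Pythonic way to iterate over two DataFrames and compare their rows?
Problem description
Given the following two DataFrames:
df1 # read from an Excel spreadsheet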
import pandas as pd

data1 = {
    'ID': ['1', '2'],
    'Prod Family Desc': ['Install', 'Maintenance'],
    'Prod Family Code': ['', ''],
    'Prod Type Desc': ['Installation Serice', 'Maintenance Service'],
    'Prod Type Code': ['', ''],
}
df1 = pd.DataFrame(data1)
print(df1)
Resulting df1:
  ID Prod Family Desc Prod Family Code      Prod Type Desc Prod Type Code
0  1          Install                  Installation Serice
1  2      Maintenance                  Maintenance Service
df2 # the result of a SQL query
data2 = {
    'Prod Class': ['F', 'F', 'T', 'T'],
    'Prod Desc': ['Install', 'Maintenance', 'Installation Serice', 'Maintenance Service'],
    'Prod Code': ['2525', '2534', 'H123', 'H321'],
}
df2 = pd.DataFrame(data2)
print(df2)
Resulting df2:
  Prod Class           Prod Desc Prod Code
0          F             Install      2525
1          F         Maintenance      2534
2          T Installation Serice      H123
3          T Maintenance Service      H321
What is the best way to assign the matching Prod Code values from df2 to the Prod Family Code and Prod Type Code columns of df1?
This is what I am doing now:
stype = df2.loc[df2['Prod Class'] == "T"]
family = df2.loc[df2['Prod Class'] == "F"]
for i, concaterow in df1.iterrows():
    for j, styp in stype.iterrows():
        if concaterow['Prod Type Desc'] == styp['Prod Desc']:
            df1.loc[i, 'Prod Type Code'] = styp['Prod Code']
    for j, scat in family.iterrows():
        if concaterow['Prod Family Desc'] == scat['Prod Desc']:
            df1.loc[i, 'Prod Family Code'] = scat['Prod Code']
print(df1)
The result is as expected:
  ID Prod Family Desc Prod Family Code      Prod Type Desc Prod Type Code
0  1          Install             2525 Installation Serice           H123
1  2      Maintenance             2534 Maintenance Service           H321
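Is there a more Pythonic way to do this?

Edit: in answer to @FatihAkici's question.
@FatihAkici - Since df2 is the result of a SQL query, my expected result is the latest value inserted into the table. So, given df2 as follows: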
data2 = {
    'Prod Class': ['F', 'F', 'F', 'T', 'T'],
    'Prod Desc': ['Install', 'Maintenance', 'Install', 'Installation Serice', 'Maintenance Service'],
    'Prod Code': ['2525', '2534', '2536', 'H123', 'H321'],
}
The expected result would be:
  ID Prod Family Desc Prod Family Code      Prod Type Desc Prod Type Code
0  1          Install             2536 Installation Serice           H123
1  2      Maintenance             2534 Maintenance Service           H321
Solution
You can combine pd.DataFrame.assign and pd.DataFrame.merge:
df1.assign(**{
    "Prod Family Code": df1.merge(df2, left_on="Prod Family Desc", right_on="Prod Desc")["Prod Code"],
    "Prod Type Code": df1.merge(df2, left_on="Prod Type Desc", right_on="Prod Desc")["Prod Code"],
})
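In your example, df1 already contains two empty columns, Prod Family Code and Prod Type Code, to receive the results, but this approach does not require them. Note that assign returns a new DataFrame rather than modifying df1 in place.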
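With the edited df2 from the question, where Install appears twice, a merge returns one row per match, so the codes may no longer line up with the rows of df1. One possible alternative, offered only as a sketch, is to build a description-to-code lookup per Prod Class and apply it with Series.map. The helper name code_lookup is hypothetical, and the sketch assumes that later rows in df2 are the more recent ones, so the last duplicate is kept.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["1", "2"],
    "Prod Family Desc": ["Install", "Maintenance"],
    "Prod Type Desc": ["Installation Serice", "Maintenance Service"],
})
df2 = pd.DataFrame({
    "Prod Class": ["F", "F", "F", "T", "T"],
    "Prod Desc": ["Install", "Maintenance", "Install",
                  "Installation Serice", "Maintenance Service"],
    "Prod Code": ["2525", "2534", "2536", "H123", "H321"],
})


def code_lookup(df, prod_class):
    """Hypothetical helper: Series mapping Prod Desc -> Prod Code for one class.

    Assumption: later rows are newer, so the last duplicate wins.
    """
    subset = df[df["Prod Class"] == prod_class]
    latest = subset.drop_duplicates(subset="Prod Desc", keep="last")
    return latest.set_index("Prod Desc")["Prod Code"]


# map() looks up each description in the lookup's index; unmatched values become NaN.
result = df1.assign(**{
    "Prod Family Code": df1["Prod Family Desc"].map(code_lookup(df2, "F")),
    "Prod Type Code": df1["Prod Type Desc"].map(code_lookup(df2, "T")),
})
print(result)
```

Filtering by Prod Class first also keeps a family description from accidentally matching a type description with the same text. If "latest" should instead be decided by a timestamp or ID column from the SQL table, sorting df2 by that column before drop_duplicates would be the place to express it.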