I need help figuring out how to change values in a pandas DataFrame only where the identifier key matches

Problem description

Imagine I have two tables.

First

customer_id first_name  gender
3343        Cristabel   female
2469        Kermie      male
996         Aura        female
1628        Hermione    female
2696        Isabelle    female

Second

customer_id first_name  gender
3343        Cristabel   u
2469        Kermie      u
996         Aura        u
1628        Hermione    u
2696        Isabelle    u
1689        Jhon        male
5698        Albert      male

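I only want to change the gender attribute for the rows in table 2 that have 'u', replacing it with the corresponding gender from table 1.

Thanks in advance for your help.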

Tags: python, pandas

Solution


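Use DataFrame.merge with a left join, then replace the 'u' values with the values from the other gender column, the one suffixed with _: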

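# Left join keeps every row of df2; the gender column from df1 arrives as 'gender_'
df2 = df2.merge(df1, on=['customer_id','first_name'], how='left', suffixes=('','_'))
# Where gender is 'u', take the value from 'gender_'; pop() also drops the helper column
df2['gender'] = df2['gender'].mask(df2['gender'] == 'u', df2.pop('gender_'))
print(df2)
   customer_id first_name  gender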
0         3343  Cristabel  female
1         2469     Kermie    male
2          996       Aura  female
3         1628   Hermione  female
4         2696   Isabelle  female
5         1689       Jhon    male
6         5698     Albert    male
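For anyone who wants to run the merge approach end to end, here is a minimal sketch that builds the two tables from the question and applies the same two steps. The variable name merged is my own choice so that df2 is left untouched; everything else follows the answer above.

```python
import pandas as pd

# Sample data copied from the two tables in the question.
df1 = pd.DataFrame({
    "customer_id": [3343, 2469, 996, 1628, 2696],
    "first_name": ["Cristabel", "Kermie", "Aura", "Hermione", "Isabelle"],
    "gender": ["female", "male", "female", "female", "female"],
})
df2 = pd.DataFrame({
    "customer_id": [3343, 2469, 996, 1628, 2696, 1689, 5698],
    "first_name": ["Cristabel", "Kermie", "Aura", "Hermione", "Isabelle",
                   "Jhon", "Albert"],
    "gender": ["u", "u", "u", "u", "u", "male", "male"],
})

# Left join: every df2 row is kept; df1's gender arrives as 'gender_'.
merged = df2.merge(df1, on=["customer_id", "first_name"],
                   how="left", suffixes=("", "_"))

# Replace 'u' with the looked-up value and drop the helper column.
merged["gender"] = merged["gender"].mask(merged["gender"] == "u",
                                         merged.pop("gender_"))
print(merged)
```

One thing to keep in mind: with this approach, a row that has 'u' in df2 but no matching row in df1 ends up as NaN, because the left join has nothing to fill 'gender_' with.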

If you want to match only on customer_id rather than on both columns, here is an alternative:

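# Rows of df2 whose gender is still unknown
mask = df2['gender'] == 'u'
# Lookup Series: customer_id -> gender, taken from df1
s = df1.set_index('customer_id')['gender']
# Map the ids of the masked rows; ids missing from df1 keep their current value
df2.loc[mask, 'gender'] = df2.loc[mask, 'customer_id'].map(s).fillna(df2['gender'])
print(df2)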
   customer_id first_name  gender
0         3343  Cristabel  female
1         2469     Kermie    male
2          996       Aura  female
3         1628   Hermione  female
4         2696   Isabelle  female
5         1689       Jhon    male
6         5698     Albert    male
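The map-based variant can also be wrapped in a small helper, sketched below. The function name fill_unknown, its parameters, and the customer_id 9999 row are hypothetical and only there for illustration; they are not part of the original answer. The sketch assumes customer_id is unique in the lookup table.

```python
import pandas as pd

def fill_unknown(target, lookup, key="customer_id", col="gender", placeholder="u"):
    """Return a copy of target where placeholder values in col are
    replaced by the value from lookup that has the same key.
    Hypothetical helper; assumes key is unique in lookup."""
    out = target.copy()
    mask = out[col] == placeholder
    s = lookup.set_index(key)[col]          # key -> value lookup Series
    # Keys missing from lookup map to NaN; fillna restores the old value.
    out.loc[mask, col] = out.loc[mask, key].map(s).fillna(out[col])
    return out

df1 = pd.DataFrame({"customer_id": [3343, 2469],
                    "gender": ["female", "male"]})
df2 = pd.DataFrame({"customer_id": [3343, 2469, 9999, 1689],  # 9999 is made up
                    "gender": ["u", "u", "u", "male"]})

result = fill_unknown(df2, df1)
print(result)
```

Here the row with the made-up id 9999 has no match in df1, so it keeps its 'u' instead of becoming NaN. That is the effect of the fillna call in the answer above.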
