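python - How to add rows from one DataFrame to another only where the values in certain columns do not match
Problem description
I have two dataframes, df1 and df2 (shown below), and I want to get df3. Basically, if a row is duplicated between the two frames and its Complete column == 'C', drop that row from df1; otherwise keep the df1 row and add the remaining rows from df2. Hopefully that makes sense! There is probably a simple way to do this and I am just making it sound more complicated than it really is.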
df1:
Complete Name Birth
C Steve 13/07/2000
C Mike 13/06/2000
C Sarah 20/05/1936
C Lewis 14/08/1955
NaN Martin 15/04/1990
NaN Lewis 15/04/1990
df2:
Complete Name Birth
NaN Steve 13/07/2000
NaN Mike 13/06/2000
NaN Sarah 20/05/1936
NaN Lewis 14/08/1955
NaN Martin 15/04/1990
NaN Lewis 15/04/1990
NaN Dave 13/04/1935
NaN Mark 14/07/1932
NaN Steve 15/06/1970
I would therefore like df1 to become:
Complete Name Birth
NaN Martin 15/04/1990
NaN Lewis 15/04/1990
NaN Dave 13/04/1935
NaN Mark 14/07/1932
NaN Steve 15/06/1970
Solution
# Merge on the identifying columns only: "Complete" differs between the frames ("C" in df1, NaN in df2), so using it as a key would stop the completed rows from matching.
# how="outer" keeps the rows of df1 and also brings in the rows found only in df2.
# indicator=True adds a "_merge" column showing where each row was found: "left_only", "right_only" or "both".
df = df1.merge(df2[["Name", "Birth"]], on=["Name", "Birth"], how="outer", indicator=True)
# Build a mask (a Series of True/False values) for rows that appear in both frames and are already complete.
mask = (df["_merge"] == "both") & (df["Complete"] == "C")
# Keep only the rows where the mask is False; "~" inverts the boolean values, so the masked rows are excluded.
df = df[~mask].drop(columns="_merge")
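To check the merge-and-mask idea against the sample data from the question, here is a minimal self-contained sketch. It assumes that Name and Birth together identify a row and that the blank Complete entries are NaN. Complete is left out of the join keys because it differs between the two frames ("C" in df1, NaN in df2). The name df3 follows the question; `keys` and `merged` are illustrative names only.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question; NaN marks a row that is not complete.
df1 = pd.DataFrame({
    "Complete": ["C", "C", "C", "C", np.nan, np.nan],
    "Name": ["Steve", "Mike", "Sarah", "Lewis", "Martin", "Lewis"],
    "Birth": ["13/07/2000", "13/06/2000", "20/05/1936",
              "14/08/1955", "15/04/1990", "15/04/1990"],
})
df2 = pd.DataFrame({
    "Complete": [np.nan] * 9,
    "Name": ["Steve", "Mike", "Sarah", "Lewis", "Martin",
             "Lewis", "Dave", "Mark", "Steve"],
    "Birth": ["13/07/2000", "13/06/2000", "20/05/1936", "14/08/1955", "15/04/1990",
              "15/04/1990", "13/04/1935", "14/07/1932", "15/06/1970"],
})

keys = ["Name", "Birth"]  # assumption: these two columns identify a person

# Outer merge on the key columns only. Complete comes from df1, so rows that
# exist only in df2 get NaN there. "_merge" records where each row was found.
merged = df1.merge(df2[keys], on=keys, how="outer", indicator=True)

# Rows present in both frames and already marked complete are dropped.
mask = (merged["_merge"] == "both") & (merged["Complete"] == "C")
df3 = merged.loc[~mask, ["Complete", "Name", "Birth"]].reset_index(drop=True)

print(df3)
```

On the sample data this leaves the five rows listed in the question (Martin, Lewis born 1990, Dave, Mark, and Steve born 1970). The row order produced by an outer merge may depend on the pandas version, so sort explicitly if the order matters.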