How do I clear repeated entries from the left side of a joined DataFrame in pandas?

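Problem description

To visualize and export wide, left-joined DataFrames in pandas, I would like to remove the repeated entries that come from the left side.

What do I mean by this?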

import pandas as pd

# Build each frame from a list of records
cities = pd.DataFrame([
    {"Name": "Peter", "City": "Boston"},
    {"Name": "Paul", "City": "Houston"},
])
emails = pd.DataFrame([
    {"Name": "Peter", "Email": "peter@company.com"},
    {"Name": "Peter", "Email": "peter@university.edu"},
    {"Name": "Paul", "Email": "paul@company.com"},
])

print(cities.merge(emails))

This prints

    Name     City                 Email
0  Peter   Boston     peter@company.com
1  Peter   Boston  peter@university.edu
2   Paul  Houston      paul@company.com

What I would like to print is

    Name     City                 Email
0  Peter   Boston     peter@company.com
1                  peter@university.edu
2   Paul  Houston      paul@company.com

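How can I achieve this, ideally during the join, so that I don't have to keep track of which columns came from the left side and which from the right?

Tags: python, pandas, dataframe, join

Solution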


Apply Series.duplicated to every column, then use DataFrame.mask to replace the flagged cells with '':

df = cities.merge(emails)

# Column by column, flag values that repeat an earlier row, then blank the flagged cells
df1 = df.mask(df.apply(pd.Series.duplicated), '')
print(df1)
    Name     City                 Email
0  Peter   Boston     peter@company.com
1                  peter@university.edu
2   Paul  Houston      paul@company.com
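One caveat: Series.duplicated is applied to each column independently, so a value that legitimately repeats across different people (for example, two people living in the same city) would be blanked as well. The following is a minimal sketch of one possible alternative, not part of the original answer: it blanks only the columns that came from the left frame, and only on rows where the entire left-hand record repeats. The extra "Mary" rows and the names left_cols and is_repeat are made up for illustration, and the sketch assumes the left frame itself contains no fully duplicated rows.

```python
import pandas as pd

cities = pd.DataFrame([
    {"Name": "Peter", "City": "Boston"},
    {"Name": "Paul", "City": "Houston"},
    {"Name": "Mary", "City": "Boston"},  # hypothetical extra row: same city as Peter
])
emails = pd.DataFrame([
    {"Name": "Peter", "Email": "peter@company.com"},
    {"Name": "Peter", "Email": "peter@university.edu"},
    {"Name": "Paul", "Email": "paul@company.com"},
    {"Name": "Mary", "Email": "mary@company.com"},  # hypothetical extra row
])

# Taking the column list from the left frame avoids tracking it by hand
left_cols = list(cities.columns)

merged = cities.merge(emails, on="Name", how="left")

# True on rows whose whole left-hand record already appeared in an earlier row
is_repeat = merged.duplicated(subset=left_cols)

# Blank only the left-hand columns on those rows
merged.loc[is_repeat, left_cols] = ""

print(merged)
```

Here Mary's row keeps "Boston", whereas the column-wise approach would have blanked it because "Boston" already appears in Peter's row. Since left_cols is read from cities.columns, the same lines should carry over to wider left frames; writing '' into non-string columns may need a cast to object first.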
