Finding the differences between two DataFrames in Python

Problem description

Suppose I have two DataFrames:

column1 column2 
  abc      2
  def      2

column1 column2 
  abc      2
  def      1

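I want to compare these two DataFrames, find the rows where they differ, and get the corresponding values of column1.

In this case the output should be 'def'.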
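
For anyone who wants to reproduce the example, the two frames above could be built as in the sketch below. The variable names A and B are an assumption borrowed from the answer; the question itself does not name the frames.

```python
import pandas as pd

# First DataFrame from the question (the name A is assumed, matching the answer)
A = pd.DataFrame({"column1": ["abc", "def"], "column2": [2, 2]})

# Second DataFrame from the question (the name B is assumed, matching the answer)
B = pd.DataFrame({"column1": ["abc", "def"], "column2": [2, 1]})

print(A)
print(B)
```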

Tags: python, pandas, dataframe

Solution


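Following an existing answer on this topic, you can try the pd.concat approach: stack both DataFrames, then drop every row that appears more than once, so that only the rows unique to one of the frames remain.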

pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique().tolist()

Output:

# to see the rows that differ between the two DataFrames
>>> pd.concat([A,B]).drop_duplicates(keep=False)
  column1  column2
1     def        2
1     def        1
# to see only 'column1' of the differing rows
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1']
1    def
1    def
Name: column1, dtype: object
# to get the unique values of 'column1' among the differing rows as a NumPy array
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique()
array(['def'], dtype=object)
# to get the unique values of 'column1' among the differing rows as a list
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique().tolist()
['def']
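
One caveat with the concat approach: drop_duplicates(keep=False) also discards rows that are repeated within a single DataFrame, even if the other frame does not contain them. If that matters for your data, a possible alternative (not part of the original answer) is an outer merge with indicator=True, which labels each row as present in the left frame, the right frame, or both. The sketch below assumes both frames share the same columns; differing_keys is a hypothetical helper name used only for illustration.

```python
import pandas as pd

def differing_keys(left, right, key="column1"):
    """Return the unique `key` values of rows found in only one of the two frames.

    Hypothetical helper: merges on all shared columns and keeps the rows
    that the merge indicator does not mark as 'both'.
    """
    merged = left.merge(right, how="outer", indicator=True)
    only_one_side = merged[merged["_merge"] != "both"]
    return only_one_side[key].unique().tolist()

# Sample data from the question (the names A and B follow the answer above)
A = pd.DataFrame({"column1": ["abc", "def"], "column2": [2, 2]})
B = pd.DataFrame({"column1": ["abc", "def"], "column2": [2, 1]})

print(differing_keys(A, B))  # ['def']
```

On the sample data both approaches give the same result; they only diverge when one frame contains repeated rows.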
