Conditionally merging pandas DataFrames

Question

I want to merge two DataFrames with the following script:

import pandas as pd

dfa = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
 'B': ['B0', 'B1', 'B2']})
dfb = pd.DataFrame({'A': ['A0', 'A1','A1', 'A2'],
 'C': ['C0', 'C1', 'C2','C3']})

dfc = dfa.merge(dfb)

for i in range(len(dfc.index)):
    if dfc['A'][i]==dfc['A'][i+1]:
        dfc.drop([i], inplace=True)

But it fails with KeyError: 4:

In [38]: runfile('C:/Users/Administrator/.spyder-py3/temp.py', wdir='C:/Users/Administrator/.spyder-py3')
Traceback (most recent call last):

  File "C:\Users\Administrator\.spyder-py3\temp.py", line 18, in <module>
    if dfc['A'][i]==dfc['A'][i+1]:

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py", line 1071, in __getitem__
    result = self.index.get_value(self, key)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 4730, in get_value
    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))

  File "pandas/_libs/index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 88, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 992, in pandas._libs.hashtable.Int64HashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item

KeyError: 4

Also, when I inspect dfc afterwards, row 1 has already been dropped:

In [30]: dfc
Out[30]: 
    A   B   C
0  A0  B0  C0
2  A1  B1  C2
3  A2  B2  C3

However, if I write line 12, dfc.drop([i], inplace=True), without inplace=True, I still get the error:

In [39]: df
Out[39]: 
    A   B   C
0  A0  B0  C0
1  A1  B1  C1
2  A1  B1  C2
3  A2  B2  C3

What is going wrong?

Tags: pandas, dataframe, merge

Answer


The row does get dropped, but the problem is this lookup:

 dfc['A'][i+1]

If you print dfc['A'][i] for each i, you get 0 - A0, 1 - A1, 2 - A1, 3 - A2.

So the loop performs the following comparisons:

dfc['A'][0] == dfc['A'][1]
dfc['A'][1] == dfc['A'][2]
dfc['A'][2] == dfc['A'][3] 
dfc['A'][3] == dfc['A'][4]

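But dfc['A'][4] does not exist in your DataFrame, which only has the index labels 0 to 3. That is why you get KeyError: 4. In short, dfc['A'][i+1] goes past the end of the index when i = 3.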
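One possible way to avoid the out-of-range lookup is sketched below. It assumes the goal is to drop every row whose A value equals the A value of the row right after it, which is what the original loop appears to attempt. The names to_drop, looped and shifted are introduced here for illustration only.

```python
import pandas as pd

dfa = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                    'B': ['B0', 'B1', 'B2']})
dfb = pd.DataFrame({'A': ['A0', 'A1', 'A1', 'A2'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
dfc = dfa.merge(dfb)

# Option 1: keep the loop idea, but stop one row early so that
# position i + 1 always exists. Positions are collected first and
# dropped afterwards, so the index is not changed while looping.
to_drop = [i for i in range(len(dfc) - 1)
           if dfc['A'].iloc[i] == dfc['A'].iloc[i + 1]]
looped = dfc.drop(dfc.index[to_drop])

# Option 2: vectorised. shift(-1) lines each row up with the next one;
# the last row is compared with a missing value, so it is always kept.
shifted = dfc[dfc['A'] != dfc['A'].shift(-1)]

print(shifted)
```

Both variants should leave the rows labelled 0, 2 and 3. If the real aim is simply one row per value of A, whether or not the duplicates are adjacent, dfc.drop_duplicates(subset='A', keep='last') may be a simpler fit. Note also that drop without inplace=True returns a new DataFrame and leaves dfc untouched, so the result has to be assigned to a name; the KeyError itself does not depend on inplace, because it comes from the i+1 lookup.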

