Python: remove duplicates entirely using two keys

Problem description

Given a list of dictionaries, each with the keys A, B and C, I want to delete duplicates (every copy, including the original) based only on the A and C keys. For example, given the following:

set=[{'A':1,'B':4,'C':2},{'A':5,'B':6,'C':0},{'A':1,'B':5,'C':2},{'A':6,'B':1,'C':9}]

I'm expecting

set=[{'A':5,'B':6,'C':0},{'A':6,'B':1,'C':9}]

Tags: python, python-3.x

Solution


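One way to get this result is to convert the list to a pandas DataFrame, call drop_duplicates on the A and C columns with keep=False (which drops every row in a duplicated group, not just the later copies), and then convert the result back to a list of dictionaries.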

In [32]: import pandas as pd

In [33]: set1=[{'A':1,'B':4,'C':2},{'A':5,'B':6,'C':0},{'A':1,'B':5,'C':2},{'A':6,'B':1,'C':9}]

In [34]: set1
Out[34]:
[{'A': 1, 'B': 4, 'C': 2},
 {'A': 5, 'B': 6, 'C': 0},
 {'A': 1, 'B': 5, 'C': 2},
 {'A': 6, 'B': 1, 'C': 9}]

In [35]: df = pd.DataFrame(set1)

In [36]: df
Out[36]:
   A  B  C
0  1  4  2
1  5  6  0
2  1  5  2
3  6  1  9

In [38]: df.drop_duplicates(subset=['A','C'],keep=False,inplace=True)

In [39]: df
Out[39]:
   A  B  C
1  5  6  0
3  6  1  9

In [40]: df.to_dict(orient='records')
Out[40]: [{'A': 5, 'B': 6, 'C': 0}, {'A': 6, 'B': 1, 'C': 9}]
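
If pandas is not otherwise needed, the same idea can probably be done with the standard library alone: count how often each (A, C) pair occurs, then keep only the dictionaries whose pair occurs once. This is a minimal sketch, not part of the original answer. The function name drop_all_duplicates and its keys parameter are made up for illustration, and it assumes the values under A and C are hashable.

```python
from collections import Counter


def drop_all_duplicates(records, keys=("A", "C")):
    """Return the records whose values for `keys` occur exactly once.

    Hypothetical helper for illustration; assumes every record has all
    of `keys` and that the corresponding values are hashable.
    """
    # Count how often each combination of key values appears.
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    # Keep only records with a unique combination, preserving input order.
    return [rec for rec in records if counts[tuple(rec[k] for k in keys)] == 1]


set1 = [
    {'A': 1, 'B': 4, 'C': 2},
    {'A': 5, 'B': 6, 'C': 0},
    {'A': 1, 'B': 5, 'C': 2},
    {'A': 6, 'B': 1, 'C': 9},
]

result = drop_all_duplicates(set1)
print(result)
# [{'A': 5, 'B': 6, 'C': 0}, {'A': 6, 'B': 1, 'C': 9}]
```

This makes two passes over the list and does not modify the input; the two dictionaries sharing A=1, C=2 are both dropped, matching keep=False in the pandas version.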
