Removing redundancy in a list of tuples with three elements

Question

I have a list of tuples similar to A:

 A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)], 
[(160, 2, 5), (1000, 2, 5), (111, 1, 2)], 
[(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)], 
[(128, 3, 4)], 
[(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]

Within each row of this list, there may be tuples whose second and third elements are the same. For example, in A[0]:

A[0] = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]

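(90, 1, 5), (1000, 1, 5) and (176, 1, 5) have the same second and third elements. Among these, I need to keep the one with the largest first element and delete the other two. So I should be able to keep (1000, 1, 5) from A[0] and delete (90, 1, 5) and (176, 1, 5).

It is preferable to keep the order of the list.

Is there a way to do this iteratively for all the rows in A? Any help would be appreciated!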

Tags: python, list, tuples

Answer


If I understand correctly, here is an itertools.groupby solution. I am assuming that the order in the final result does not matter.

from itertools import groupby

def keep_max(lst, groupkey, maxkey):
    'Group lst by groupkey and keep the maximum of each group w.r.t. maxkey.'
    # groupby only merges adjacent items, so sort by the grouping key first
    sor = sorted(lst, key=groupkey)
    groups = (tuple(g) for _, g in groupby(sor, key=groupkey))
    return [max(g, key=maxkey) for g in groups]

In action:

>>> from operator import itemgetter
>>> groupkey = itemgetter(1, 2)
>>> maxkey = itemgetter(0)
>>> A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)], [(160, 2, 5), (1000, 2, 5), (111, 1, 2)], [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)], [(128, 3, 4)], [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
>>>
>>> [keep_max(sub, groupkey, maxkey) for sub in A]
[[(111, 1, 2), (139, 1, 3), (1000, 1, 5)],
 [(111, 1, 2), (1000, 2, 5)],
 [(139, 1, 3), (128, 3, 4), (134, 3, 5)],
 [(128, 3, 4)],
 [(1000, 1, 5), (1000, 2, 5), (134, 3, 5)]]
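
Because keep_max sorts each row by the grouping key, the surviving tuples come back ordered by their (second, third) elements rather than by their original positions, while the question says keeping the order is preferable. The following is a minimal sketch of an order-preserving variant, not part of the original answer. The name keep_max_ordered is hypothetical, and the sketch assumes that "keeping the order" means each surviving tuple stays at the position it had in its row, and that on a tie in the first element the earliest tuple wins.

```python
def keep_max_ordered(lst):
    """Keep, for each (second, third) pair, the tuple with the largest
    first element, leaving the survivors in their original order."""
    best = {}  # (second, third) -> index of the best tuple seen so far
    for i, t in enumerate(lst):
        key = (t[1], t[2])
        # Strict '>' means the earliest tuple wins on a tie (an assumption).
        if key not in best or t[0] > lst[best[key]][0]:
            best[key] = i
    # Sorting the winning indices restores the original row order.
    return [lst[i] for i in sorted(best.values())]


row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
print(keep_max_ordered(row))
# [(139, 1, 3), (1000, 1, 5), (111, 1, 2)]
```

Applied to every row, this would be [keep_max_ordered(sub) for sub in A]. It makes a single pass per row with a dictionary instead of sorting, so it also avoids the O(n log n) sort that groupby requires.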
