Removing duplicate characters from a Python list with a repeating pattern

Problem description

I am monitoring a serial port that sends data like this:

['','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
 '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
 '','','e','e','e','e','e','e','','','a','a','a','a','a','a',
 '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
 '','','','d','d','d','d','d','d','','','e','e','e','e','e','e',
 '','','a','a','a','a','a','a','','b','b','b','b','b','b','b','b',
 '','','c','c','c','c','c','c','','','','d','d','d','d','d','d','d','d',
 '','','e','e','e','e','e','e','','','a','a','a','a','a','a',
 '','','','b','b','b','b','b','b','b','b','b','','','c','c','c','c','c','c',
 '','','','d','d','d','d','d','d','','','e','e','e','e','e','e','','']

I need to be able to convert it to:

['a','b','c','d','e','a','b','c','d','e','a','b','c','d','e','a','b','c','d','e']

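So I want to remove the consecutive duplicates and the empty strings, while preserving the number of times the pattern repeats.

I have not been able to figure this out. Can anyone help?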

Tags: python, python-3.x, list

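Solution

Here is a solution using a list comprehension and itertools.zip_longest: an element is kept only if it is not an empty string and is not equal to the next element. A second iterator over the same list, advanced by one element, supplies the "next" values, which avoids the cost of copying the list with a slice.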

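from itertools import zip_longest

def remove_consecutive_duplicates(lst):
    # Second iterator over the same list, advanced by one element.
    ahead = iter(lst)
    next(ahead, None)  # the default avoids StopIteration when lst is empty
    # Pair each element with its successor; the last one is paired with None.
    # Keep x only if it is truthy (non-empty) and differs from the next element.
    return [x for x, y in zip_longest(lst, ahead) if x and x != y]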
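
To see why this comparison keeps exactly one element per run, it may help to look at the pairs that zip_longest produces. The following is only an illustrative sketch; pairs_with_next is a hypothetical helper name that does not appear in the answer itself.

```python
from itertools import zip_longest

def pairs_with_next(lst):
    # Hypothetical helper (not part of the original answer):
    # pair each element with its successor in the list.
    ahead = iter(lst)
    next(ahead, None)  # skip the first element; the default handles empty input
    return list(zip_longest(lst, ahead))

print(pairs_with_next(['a', 'a', '', 'b']))
# [('a', 'a'), ('a', ''), ('', 'b'), ('b', None)]
```

Only the pairs whose first item is non-empty and differs from the second survive the filter, here ('a', '') and ('b', None), so the result would be ['a', 'b'].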

Usage, where my_list is the list from the question:

>>> remove_consecutive_duplicates([1, 1, 2, 2, 3, 1, 3, 3, 3, 2])
[1, 2, 3, 1, 3, 2]
>>> remove_consecutive_duplicates(my_list)
['a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd', 'e', 'a', 'b', 'c', 'd',
 'e', 'a', 'b', 'c', 'd', 'e']

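I am assuming that there are no duplicates separated by empty strings (for example 'a', '', 'a'), or that you do not want to remove such duplicates. If that assumption is wrong, you should filter out the empty strings first: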

>>> example = ['a', '', 'a']
>>> remove_consecutive_duplicates([ x for x in example if x ])
['a']
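
If filtering first is always what you want, the same idea could also be sketched with itertools.groupby, which groups consecutive equal elements. This is an alternative technique rather than the method used in the answer above, and collapse_runs is a hypothetical name chosen for illustration.

```python
from itertools import groupby

def collapse_runs(lst):
    # Hypothetical alternative (not the original answer's method):
    # drop empty strings and other falsy values first ...
    non_empty = (x for x in lst if x)
    # ... then keep one key per run of consecutive equal elements.
    return [key for key, _group in groupby(non_empty)]

print(collapse_runs(['', 'a', 'a', '', 'a', 'b', 'b', '', 'c']))
# ['a', 'b', 'c']
```

Unlike remove_consecutive_duplicates on its own, this sketch always merges runs that are separated only by empty strings.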
