Removing similar strings from a list in Python

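Problem description

I have a list in Python, and I want to remove any element that already appears as part of another, concatenated element.

Example 1: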

list1 = ['data', 'entry', 'data entry']

output = ['data entry']

The expectation is that 'data' and 'entry' are removed because 'data entry' already exists in the list.

Example 2:

list1 = ['dining table', 'table', 'dining']

output = ['dining table']

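Tags: python-3.x, list

Solution

For every item that can be split on the space delimiter, check whether any part of the split string also appears in the original list as a standalone element. If it does, remove that standalone element.

I have written an example with plenty of comments and tests.

Code: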

list1 = ["data", "entry", "data entry"]
list2 = ["dining table", "table", "dining"]
list3 = ["very fast car", "car", "bird", "fast"]


def my_func(my_list):
    for x in list(my_list):  # Iterate over a copy, because my_list is modified inside the loop
        if " " not in x:  # Single-word element
            continue  # Nothing to split, move on to the next element
        for y in x.split(" "):  # Split the element on the space delimiter
            if y in my_list:  # The part also exists as a standalone element
                my_list.remove(y)  # Remove that standalone element from the list


my_func(list1)
my_func(list2)
my_func(list3)

print(list1)
print(list2)
print(list3)

Output:

$ python3 test.py
['data entry']
['dining table']
['very fast car', 'bird']
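
As a variation on the approach above, here is a minimal sketch that builds a new list instead of modifying the input in place, which avoids removing items from a list while looping over it. The function name `remove_parts` is hypothetical and not part of the original answer.

```python
def remove_parts(items):
    """Return a new list without the words that occur inside a multi-word item."""
    # Collect every word that appears in an item made of several words.
    parts = set()
    for item in items:
        words = item.split(" ")
        if len(words) > 1:
            parts.update(words)
    # Keep only the items that are not one of those words.
    return [item for item in items if item not in parts]


print(remove_parts(["data", "entry", "data entry"]))
print(remove_parts(["dining table", "table", "dining"]))
print(remove_parts(["very fast car", "car", "bird", "fast"]))
```

For the three test lists this prints the same results as `my_func`, while leaving the original lists untouched.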

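Edit:

If you want to check every combination of the split string, you can use the itertools module (chain, combinations).

With this solution:

Input: ["very fast car", "car", "bird", "fast", "very fast", "very car"]

Output: ["very fast car", "bird"]

This means that the "very fast" and "very car" elements are also removed, because they are combinations of parts of the "very fast car" element.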
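
To make the idea of "combinations of parts" concrete, the following sketch lists the candidate strings derived from a single phrase. The helper name `sub_phrases` is hypothetical. Unlike a full powerset, it deliberately skips the empty combination and the complete phrase.

```python
from itertools import chain, combinations


def sub_phrases(phrase):
    """Return every proper, non-empty word combination of phrase, joined by spaces."""
    words = phrase.split(" ")
    # combinations() preserves the original word order, so "car very" is never produced.
    combos = chain.from_iterable(
        combinations(words, r) for r in range(1, len(words))
    )
    return [" ".join(combo) for combo in combos]


print(sub_phrases("very fast car"))
# ['very', 'fast', 'car', 'very fast', 'very car', 'fast car']
```

Any of these strings that also appears in the list as its own element is a candidate for removal.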

Code:

from itertools import chain, combinations

list1 = ["data", "entry", "data entry"]
list2 = ["dining table", "table", "dining"]
list3 = ["very fast car", "car", "bird", "fast"]
list4 = ["very fast car", "car", "bird", "fast", "very fast", "very car"]


def powerset(iterable):
    s = list(iterable)  # Materialize the iterable; duplicate elements are allowed
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))  # Return all combinations of every length


def my_func(my_list):
    for x in list(my_list):  # Iterate over a copy, because my_list is modified inside the loop
        if " " not in x:  # Single-word element
            continue  # Nothing to split, move on to the next element
        for combo in powerset(x.split(" ")):  # Iterate through all word combinations
            related_string = " ".join(combo)  # Join the words into a space-separated string
            if related_string == x:  # Never remove the element that is being tested
                continue
            if related_string in my_list:  # The combination also exists as a standalone element
                my_list.remove(related_string)  # Remove that standalone element from the list


my_func(list1)
my_func(list2)
my_func(list3)
my_func(list4)

print(list1)
print(list2)
print(list3)
print(list4)

Output:

$ python3 test.py
['data entry']
['dining table']
['very fast car', 'bird']
['very fast car', 'bird']
