Extract all elements in a list that match a string

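Problem description

I have a list of keywords and a list of input lists. My task is to find the sub-lists that contain a keyword, even as a partial match. I can extract the lists that contain a keyword with the following code: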


t_list = [['Subtotal: ', '1,292.80 '], ['VAT ', ' 64.64 '], ['RECEIPT TOTAL ', 'AED1,357.44 '],  
          ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '], 
          ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
          ['Learn', 'Bectricity total ', '', '', '63.61 ']]

keyword = ['total ', 'amount ']

for string_list in t_list:
    # Drop empty strings from the sub-list in place
    string_list[:] = [item for item in string_list if item != '']
    for element in string_list:
        element = element.lower()
        # Case-insensitive substring check against every keyword
        if any(s in element for s in keyword):
            print(string_list)

The output is:
 [['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '], ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '3.03 '],
          ['Learn', 'Bectricity total ', '63.61 ']]

The desired output contains only the string that matches a keyword, plus the numbers in that list.

Desired output:

[['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['Sub total ', '60.58 '], ['amount 160.58 ', '3.03 '],['Bectricity total ', '63.61 ']]

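It would be perfect if I could get the output as a dictionary, with the string that matches a keyword as the key and the numbers as the values.

Thanks in advance!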

Tags: python, python-3.x, list

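Solution


This is the answer from our chat, slightly modified, with some comments added to explain the code. Feel free to ask me to clarify or change anything.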

import re

t_list = [
    ['Subtotal: ', '1,292.80 '],
    ['VAT ', ' 64.64 '],
    ['RECEIPT TOTAL ', 'AED1,357.44 '],
    ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
    ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
    ['Learn', 'Bectricity total ', '', '', '63.61 ']
]

keywords = ['total ', 'amount ']

output = {}

for sub_list in t_list:
    # Becomes the string that matched the keyword if one is found
    matched = None

    for item in sub_list:
        for keyword in keywords:
            if keyword in item.lower():
                matched = item

    # If a match was found, then we start looking at the list again
    # looking for the numbers
    if matched:
        for item in sub_list:
            # split the string so for example 'amount 160.58 ' becomes ['amount', '160.58']
            # This allows us to more easily extract just the number
            split_items = item.split()
            for split_item in split_items:
                # Simple use of regex to match any '.' with digits either side
                re_search = re.search(r'[0-9][.][0-9]', split_item)
                if re_search:
                    # Try block because we are making a list. If the list exists, 
                    # then just append a value, otherwise create the list with the item
                    # in it
                    try:
                        output[matched.strip()].append(split_item)
                    except KeyError:
                        output[matched.strip()] = [split_item]

print(output)
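
One thing to watch for: the keyword 'total ' ends with a space, so the loop above does not match 'Subtotal: ', where 'total' is followed by a colon. That row never reaches the dictionary. Below is a minimal sketch of one possible workaround, assuming it is acceptable to strip whitespace from the keywords before comparing. The helper name find_label is hypothetical, and unlike the loop above it returns the first matching item rather than the last.

```python
t_list = [
    ['Subtotal: ', '1,292.80 '],
    ['VAT ', ' 64.64 '],
    ['RECEIPT TOTAL ', 'AED1,357.44 '],
    ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
    ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
    ['Learn', 'Bectricity total ', '', '', '63.61 ']
]

keywords = ['total ', 'amount ']


def find_label(sub_list, keywords):
    """Hypothetical helper: return the first item that contains any
    keyword (case-insensitive, keyword whitespace stripped), or None."""
    # Assumption: trailing/leading spaces in keywords are not significant
    stripped = [k.strip().lower() for k in keywords]
    for item in sub_list:
        lowered = item.lower()
        if any(k in lowered for k in stripped):
            return item
    return None


labels = [find_label(sub_list, keywords) for sub_list in t_list]
print(labels)
```

With stripped keywords, 'Subtotal: ' is picked up as a label while the 'VAT ' row still yields None. The trade-off is that a bare 'total' can also match inside unrelated words, so this may be too permissive for other data.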

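You mentioned wanting to match a string such as 'AED 63.61'. My solution uses .split() to separate the string, which makes the number easier to grab. For example, for a string like 'amount 160.58', it grabs just 160.58. I don't know how to go about matching a string that you want to keep whole while not matching the one I just mentioned (unless, of course, it is 'AED', in which case we can add more logic to match anything containing 'aed').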
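
If only the numeric part is needed, one possible alternative to splitting is to let a regular expression pull the number out of the token, whether or not a prefix such as 'AED' is attached. This is only a sketch: the pattern and the helper name extract_numbers are assumptions, and the pattern deliberately requires a decimal point, so values like '2,000' in 'upto2,000' are ignored.

```python
import re

# Assumed number format: digits with optional comma thousands separators,
# followed by a mandatory decimal part (e.g. 1,357.44 or 63.61).
NUMBER_RE = re.compile(r'\d[\d,]*\.\d+')


def extract_numbers(text):
    """Hypothetical helper: return every decimal number found in text."""
    return NUMBER_RE.findall(text)


for sample in ['AED1,357.44 ', 'amount 160.58 ', 'AED 63.61', 'upto2,000 ']:
    print(repr(sample), extract_numbers(sample))
```

The results are still strings. Converting them to float would additionally require removing the commas, for example with value.replace(',', '').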

