How to match identical strings in the sublists of a list and output the results separately

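Problem description

Hi everyone! I have a list (final_word_list), and I want to find which of its words appear in each sublist of a list named "texts_under_directory", then output the matches for each sublist separately, as shown below.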

from nltk.tokenize import word_tokenize

final_word_list = ['zero', 'two', 'four', 'six', 'eight', 'ten', 'twelve', 'fourteen', 'sixteen']

texts_under_directory = [['one', 'two', 'three', 'four', 'five', 'six'], ['five', 'six', 'seven', 'eight', 'nine', 'ten'], ['eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen']]

# texts_under_directory[0] = ['one', 'two', 'three', 'four', 'five', 'six']
# texts_under_directory[1] = ['five', 'six', 'seven', 'eight', 'nine', 'ten']
# texts_under_directory[2] = ['eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen']

final_result = []

i = 0
while i < len(texts_under_directory):
    for b in texts_under_directory[i]:
        for a in final_word_list:
            if a == b:
                for x in word_tokenize(b):
                    final_result.append(x)

    print(sorted(set(final_result)))

    i += 1

The output is:

['four', 'six', 'two']
['eight', 'four', 'six', 'ten', 'two']
['eight', 'four', 'six', 'ten', 'twelve', 'two']

My expected output is:

['four', 'six', 'two']
['eight', 'six', 'ten']
['eight', 'ten', 'twelve']

Tags: list, for-loop, while-loop

Solution


OK, I found the answer myself. I'll leave it below.

from nltk.tokenize import word_tokenize

final_word_list = ['zero', 'two', 'four', 'six', 'eight', 'ten', 'twelve', 'fourteen', 'sixteen']

texts_under_directory = [['one', 'two', 'three', 'four', 'five', 'six'], ['five', 'six', 'seven', 'eight', 'nine', 'ten'], ['eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen']]

# texts_under_directory[0] = ['one', 'two', 'three', 'four', 'five', 'six']
# texts_under_directory[1] = ['five', 'six', 'seven', 'eight', 'nine', 'ten']
# texts_under_directory[2] = ['eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen']

# One result list per sublist, so matches no longer accumulate across sublists
final_result = [[] for _ in range(len(texts_under_directory))]

i = 0
while i < len(texts_under_directory):
    for b in texts_under_directory[i]:
        for a in final_word_list:
            if a == b:
                for x in word_tokenize(b):
                    final_result[i].append(x)

    print(sorted(set(final_result[i])))

    i += 1

Output:

['four', 'six', 'two']
['eight', 'six', 'ten']
['eight', 'ten', 'twelve']
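
For what it's worth, the original attempt printed growing results because every match was appended to one shared final_result list, so each print included the matches from all earlier sublists. Keeping a separate list per sublist fixes that. Below is a shorter sketch of the same idea using a set intersection. It assumes every element is already a single word, so the word_tokenize call is skipped (for multi-word strings you would still need to tokenize first). The function name match_per_sublist is just an illustrative name, not something from the code above.

```python
final_word_list = ['zero', 'two', 'four', 'six', 'eight', 'ten',
                   'twelve', 'fourteen', 'sixteen']

texts_under_directory = [
    ['one', 'two', 'three', 'four', 'five', 'six'],
    ['five', 'six', 'seven', 'eight', 'nine', 'ten'],
    ['eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen'],
]


def match_per_sublist(word_list, sublists):
    """Return, for each sublist, the sorted words that also occur in word_list."""
    # Build the lookup set once; membership tests on a set are fast.
    wanted = set(word_list)
    # A fresh result is computed for every sublist, so nothing carries over.
    return [sorted(wanted.intersection(sub)) for sub in sublists]


results = match_per_sublist(final_word_list, texts_under_directory)
for matches in results:
    print(matches)
```

Because each result is built independently, this also works for any number of sublists without keeping a counter or pre-sizing the result list.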
