Optimizing counting occurrences of a list of words in a given string (Python)

Question

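I'm writing a function that counts how many times each word in searched_words occurs in a given string. The result is a dictionary with the matched words as keys and their occurrence counts as values.

I've written a function that does this, but it is poorly optimized.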

def get_words(string, searched_words):
    words = string.split()

    # O(nm) where n is length of words and m is length of searched_words
    found_words = [x for x in words if x in searched_words]

    # O(n^2) where n is length of found_words
    words_dict = {}
    for word in found_words:
        words_dict[word] = found_words.count(word)

    return words_dict


print(get_words('pizza pizza is very cool cool cool', ['cool', 'pizza']))
# Results in {'pizza': 2, 'cool': 3}

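I tried to use Counter from Python's collections module, but couldn't reproduce the desired output. It also seems that using the set data type could solve my optimization problem, but I'm not sure how to count word occurrences when using a set.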

Tags: python, dictionary, optimization, list-comprehension

Answer

You're right that Counter gives a good solution:

from collections import Counter

string = 'pizza pizza is very cool cool cool'
search_words = ['cool', 'pizza']
word_counts = Counter(string.split())

# If you want to get a dict only containing the counts of words in search_words:
search_word_counts = {wrd: word_counts[wrd] for wrd in search_words}
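
The question also asks how a set could help. One possible way to combine the two ideas, offered as a minimal sketch rather than the answer's own method, is to convert searched_words to a set so each membership test is O(1) on average, and let Counter tally only the matching words in a single pass. The name `searched` is introduced here purely for illustration.

```python
from collections import Counter


def get_words(string, searched_words):
    # Assumption: words are separated by whitespace, as in the question.
    # A set gives average O(1) membership tests instead of O(m) for a list.
    searched = set(searched_words)
    # Count only the words that are being searched for, in one pass over the string.
    return dict(Counter(word for word in string.split() if word in searched))


print(get_words('pizza pizza is very cool cool cool', ['cool', 'pizza']))
# {'pizza': 2, 'cool': 3}
```

Unlike the dict comprehension above, which adds an entry with a count of 0 for every search word that never appears, this version omits unmatched words, which matches the behaviour of the original function in the question.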
