首页 > 解决方案 > 比较 Python'c 计数器对象与排序列表的复杂性

问题描述

我有两个版本的逻辑来找出单词列表中的字谜:

  1. 第一个使用 Python 的集合模块并使用Counter

def anagram_checker(cls, word=None, words=[]):

    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = Counter(word)
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if Counter(given_word) == counter_word:
                    anagrams.append(given_word)
    return anagrams
  1. 第二个具有相同的功能,但使用 sorted 内置函数。

def anagram_checker(cls, word=None, words=[]):

    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = list(sorted(word))
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if list(sorted(given_word)) == counter_word:
                    anagrams.append(given_word)
    return anagrams

哪个具有更好的时间复杂度。我的意思是比较 Python Counter 对象具有更好的复杂性还是比较排序列表具有更好的时间复杂度?

如果我没有错,比较列表的复杂度为 O(n) 对。比较两个 Counter 对象的复杂度是多少。

我搜索了各种文档,但没有找到任何令人满意的答案。

请帮忙。

标签: python

解决方案


我做了一些测量,虽然比较列表和计数器都是 O(n),创建计数器是 O(n),这比排序 O(n.log n) 好,但anagram_checkerwith 排序更快。

from timeit import timeit
from collections import Counter

def anagram_checker_1(word=None, words=[]):
    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = Counter(word)
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if Counter(given_word) == counter_word:
                    anagrams.append(given_word)
    return anagrams


def anagram_checker_2(word=None, words=[]):
    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = list(sorted(word))
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if list(sorted(given_word)) == counter_word:
                    anagrams.append(given_word)
    return anagrams

print(timeit("anagram_checker_1('battle', ['battet', 'batlet', 'battel', 'tablet'])", globals=globals(), number=100_000))
print(timeit("anagram_checker_2('battle', ['battet', 'batlet', 'battel', 'tablet'])", globals=globals(), number=100_000))

输出:

2.3342012430075556
0.47786532100872137

分析 anagram 1 显示了这一点:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   500000    0.662    0.000    2.512    0.000 /usr/lib/python3.6/collections/__init__.py:517(__init__)
   500000    0.501    0.000    0.821    0.000 /usr/lib/python3.6/abc.py:180(__instancecheck__)
   500000    0.438    0.000    1.808    0.000 /usr/lib/python3.6/collections/__init__.py:586(update)
   100000    0.433    0.000    2.978    0.000 example.py:4(anagram_checker_1)
  1000006    0.320    0.000    0.320    0.000 /usr/lib/python3.6/_weakrefset.py:70(__contains__)
   500000    0.283    0.000    0.283    0.000 {built-in method _collections._count_elements}
   500002    0.225    0.000    1.047    0.000 {built-in method builtins.isinstance}
  1100000    0.090    0.000    0.090    0.000 {built-in method builtins.len}
        1    0.042    0.042    3.020    3.020 <timeit-src>:2(inner)
   300000    0.025    0.000    0.025    0.000 {method 'append' of 'list' objects}

因此可以清楚地看到,Python 在创建Counter对象方面的开销在这里覆盖了任何算法复杂性的优势。

编辑:

Anagram 2 的分析报告,用于比较:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   500000    0.372    0.000    0.372    0.000 {built-in method builtins.sorted}
   100000    0.353    0.000    0.762    0.000 example.py:18(anagram_checker_2)
        1    0.041    0.041    0.803    0.803 <timeit-src>:2(inner)
   300000    0.028    0.000    0.028    0.000 {method 'append' of 'list' objects}
   100000    0.009    0.000    0.009    0.000 {built-in method builtins.len}

推荐阅读