python - Optimizing the execution time of checking whether the chars of a word are in a list
Problem description
I am writing Python 2.7.15 code that iterates over the chars of a word. How can I speed up checking, for every word, whether its chars are contained in an external list?
I have tried two versions of the code: version (1) spells out what the code has to do, whereas version (2) is a compact form of the same logic.
chars_array = ['a','b','c']
VERSION (1)
def version1(word):
    chars = [x for x in word]
    count = 0
    for c in chars:
        if c not in chars_array:
            count += 1
    return count
VERSION (2)
def version2(word):
    return sum([1 for c in [x for x in word] if c not in chars_array])
I am analyzing a large corpus: version1 takes 8.56 s to run, whereas version2 takes 8.12 s.
Solution
The fastest solution (can be up to 100x faster for an extremely long string):
joined = ''.join(chars_array)
def version3(word):
    return len(word.translate(None, joined))
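The two-argument call `word.translate(None, joined)` only exists on Python 2 byte strings. For readers on Python 3, the same idea can be sketched as follows; the name `version3_py3` and the module-level `delete_table` are illustrative assumptions, not part of the original answer.

```python
# Assumption: Python 3, where str.translate takes a single mapping argument.
chars_array = ['a', 'b', 'c']

# The third argument of str.maketrans lists the characters to delete.
# The table is built once, outside the function.
delete_table = str.maketrans('', '', ''.join(chars_array))

def version3_py3(word):
    # Strip every char found in chars_array, then count what remains.
    return len(word.translate(delete_table))
```

Calling `version3_py3('abcxyz')` counts the three chars that are not in `chars_array`.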
Another solution, slower than version3 and approximately the same speed as your code:
from itertools import ifilterfalse
def version4(word):
    return sum(1 for _ in ifilterfalse(set(chars_array).__contains__, word))
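Note that `version4` rebuilds the set on every call. A possible variation, sketched here under the assumption that `chars_array` does not change between calls, builds the set once at module level. The names `chars_set` and `version4_set` are hypothetical, and the import fallback is only there so the sketch runs on both Python 2 and Python 3.

```python
try:
    from itertools import ifilterfalse as filterfalse  # Python 2
except ImportError:
    from itertools import filterfalse  # Python 3

chars_array = ['a', 'b', 'c']

# Built once; membership tests on a set are O(1) on average,
# whereas "c in chars_array" scans the list.
chars_set = frozenset(chars_array)

def version4_set(word):
    # Count the chars of word that are NOT in chars_set.
    return sum(1 for _ in filterfalse(chars_set.__contains__, word))
```

Whether this is measurably faster depends on the size of `chars_array` and on word length; with a three-element list the difference is likely small.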
Timings (s is a random string):
In [17]: %timeit version1(s)
1000 loops, best of 3: 79.9 µs per loop
In [18]: %timeit version2(s)
10000 loops, best of 3: 98.1 µs per loop
In [19]: %timeit version3(s)
100000 loops, best of 3: 4.12 µs per loop # <- fastest
In [20]: %timeit version4(s)
10000 loops, best of 3: 84.3 µs per loop
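The `%timeit` lines above require IPython. Outside IPython, comparable measurements can be taken with the standard `timeit` module. The sketch below is only an illustration: the string length of 1000, the seed, the repeat counts, and the helper names `count_with_list`, `count_with_set`, and `best_time` are all assumptions, since the original does not say how `s` was generated.

```python
import random
import string
import timeit

chars_array = ['a', 'b', 'c']
chars_set = frozenset(chars_array)

def count_with_list(word):
    # Same logic as version2, with a list membership test.
    return sum(1 for c in word if c not in chars_array)

def count_with_set(word):
    # Same logic, with a set membership test.
    return sum(1 for c in word if c not in chars_set)

# Assumed test input: 1000 random lowercase letters, seeded for repeatability.
random.seed(0)
s = ''.join(random.choice(string.ascii_lowercase) for _ in range(1000))

def best_time(func, repeat=3, number=100):
    # Best of `repeat` runs, reported as seconds per single call.
    return min(timeit.repeat(lambda: func(s), repeat=repeat, number=number)) / number

results = {f.__name__: best_time(f) for f in (count_with_list, count_with_set)}
for name, seconds in sorted(results.items()):
    print('%s: %.1f us per call' % (name, seconds * 1e6))
```

Absolute numbers will differ from the ones quoted above, because they depend on the machine, the interpreter version, and the input string.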