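python - Running the same function in parallel inside a for loop
Problem description
Using Python 2.7, I created a sample dictionary and a couple of functions that take subsets of the dictionary and iterate over each subset...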
from itertools import islice
from multiprocessing import Process
from collections import OrderedDict
global pair_dict
pair_dict = {
1: 'one',
2: 'two',
3: 'three',
4: 'four',
5: 'five',
6: 'six',
7: 'seven',
8: 'eight'
}
global test_printer
def test_printer(start_chunk, end_chunk):
    fin_dict = OrderedDict(sorted(pair_dict.items()))
    sub_dict = dict(fin_dict.items()[start_chunk:end_chunk])
    for key, value in sub_dict.iteritems():
        print key, value
    print '-' * 50
def set_chunk_start_end_points():
    # Takes the dictionary and chunks for parallel execution.
    for i in range(2, 9, 2):
        start_chunk = i - 2
        end_chunk = i
        test_printer(start_chunk, end_chunk)
        #first = Process(target=test_printer, args=(start_chunk, end_chunk)).start()
set_chunk_start_end_points()
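...I have seen examples of multiprocessing in use, but none of them seem to fit what I am trying to do. The sample code creates four sub-dictionaries and processes them serially. I want them to run in parallel.
If you comment out the line test_printer(start_chunk, end_chunk) and uncomment the line below it, I would expect to see the same output, except that Python would use multiple threads to produce it. Right now, however, nothing happens.
What am I doing wrong?
Thanks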
Solution
I have always found pool.map to be the easiest way to run the same function in parallel. Maybe you will find it helpful.
from itertools import islice
from multiprocessing import Pool as ProcessPool # easier to work with for this sort of thing
from collections import OrderedDict
# You were using globals wrong. But that's a separate topic.
pair_dict = {
1: 'one',
2: 'two',
3: 'three',
4: 'four',
5: 'five',
6: 'six',
7: 'seven',
8: 'eight'
}
# This only needs to be executed once. Not every time the function is called.
fin_dict = OrderedDict(sorted(pair_dict.items()))
def test_printer(chunk): # Going to make this take 1 argument. Just easier.
    start_chunk = chunk[0]
    end_chunk = chunk[1] # All things considered this should be called chunk_end, not end_chunk
    # list for python3 compatibility
    sub_dict = dict(list(fin_dict.items())[start_chunk:end_chunk])
    # .items() for python3 compatibility
    for key, value in sub_dict.items():
        print(key, value) # Looks like you're still using Python2.7? Upgrade friend. Little support for that stuff anymore.
    print('-' * 50)
def set_chunk_start_end_points():
    # Takes the dictionary and chunks for parallel execution.
    # comment: Does it? This function takes no arguments from what I can see.
    # Think through your comments carefully.
    # Let's calculate the chunks upfront:
    chunks = [(i-2, i) for i in range(2,9,2)]
    with ProcessPool(4) as pool: # however many processes you want
        pool.map(test_printer, chunks)

if __name__ == '__main__':
    # Guard the entry point so worker processes can import this module
    # without re-running the pool setup (required on Windows).
    set_chunk_start_end_points()
Note that unless you want specific chunks, pool.map will do the chunking for you. In this case, it is actually chunking our list of chunks!
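To illustrate that last point, here is a minimal sketch in which pool.map receives the individual key/value pairs and splits them into batches itself through its chunksize argument. The names format_pair and format_all are hypothetical helpers made up for this example, and the worker returns a string instead of printing so that the parent process controls the output order. It assumes Python 3.3 or later, since the pool is used as a context manager.

```python
from multiprocessing import Pool

pair_dict = {
    1: 'one', 2: 'two', 3: 'three', 4: 'four',
    5: 'five', 6: 'six', 7: 'seven', 8: 'eight',
}


def format_pair(item):
    """Format one (key, value) pair; this runs inside a worker process."""
    key, value = item
    return '{} {}'.format(key, value)


def format_all(pairs, processes=4, chunksize=2):
    """Hypothetical helper: hand every pair to the pool and let map() batch them.

    chunksize=2 asks the pool to send the items to the workers two at a
    time, which roughly mirrors the manual (start, end) chunks above.
    """
    with Pool(processes) as pool:
        # map() returns the results in the same order as the input.
        return pool.map(format_pair, pairs, chunksize=chunksize)


if __name__ == '__main__':
    for line in format_all(sorted(pair_dict.items())):
        print(line)
```

Because map() preserves input order, the lines should come back sorted by key no matter which worker handled which batch. If chunksize is left out, the pool picks a batch size on its own.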