
Problem description

Using Python 2.7, I created a sample dictionary and a couple of functions that split the dictionary into subsets and iterate over each subset...

from itertools import islice
from multiprocessing import Process
from collections import OrderedDict

global pair_dict

pair_dict = {
    1: 'one',
    2: 'two',
    3: 'three',
    4: 'four',
    5: 'five',
    6: 'six',
    7: 'seven',
    8: 'eight'
}


global test_printer



def test_printer(start_chunk, end_chunk):

    fin_dict = OrderedDict(sorted(pair_dict.items()))
    sub_dict = dict(fin_dict.items()[start_chunk:end_chunk])

    for key, value in sub_dict.iteritems():

        print key, value

    print '-' * 50



def set_chunk_start_end_points():


    # Takes the dictionary and chunks for parallel execution.

    for i in range(2, 9, 2):

        start_chunk = i - 2
        end_chunk = i

        test_printer(start_chunk, end_chunk)

        #first = Process(target=test_printer, args=(start_chunk, end_chunk)).start()

set_chunk_start_end_points()

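...I have seen examples of multiprocessing in use, but none of them seem to fit what I am trying to do. The sample code creates four sub-dictionaries and processes them serially. I would like them to run in parallel.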

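If you comment out the line test_printer(start_chunk, end_chunk) and uncomment the line below it, I would expect to see the same output, except that Python would be using multiple processes to produce it. Right now, however, nothing happens.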

What am I doing wrong?

Thanks

Tags: python, multiprocessing

Solution


I have always found pool.map to be the easiest way to run the same function in parallel. Perhaps you will find it helpful.

from multiprocessing import Pool as ProcessPool # easier to work with for this sort of thing
from collections import OrderedDict

# A `global` statement at module level has no effect: names defined here are already global.

pair_dict = {
    1: 'one',
    2: 'two',
    3: 'three',
    4: 'four',
    5: 'five',
    6: 'six',
    7: 'seven',
    8: 'eight'
}

# This only needs to be executed once. Not every time the function is called.
fin_dict = OrderedDict(sorted(pair_dict.items()))


def test_printer(chunk): # Going to make this take 1 argument. Just easier.
    start_chunk = chunk[0]
    end_chunk = chunk[1]  # All things considered this should be called chunk_end, not end_chunk

    # list for python3 compatibility
    sub_dict = dict(list(fin_dict.items())[start_chunk:end_chunk])


    # .items() for python3 compatibility
    for key, value in sub_dict.items():
        print(key, value) # Looks like you're still using Python2.7? Upgrade friend. Little support for that stuff anymore.

    print('-' * 50)



def set_chunk_start_end_points():
    # Takes the dictionary and chunks for parallel execution.
    # comment: Does it? This function takes no arguments from what I can see.
    # Think through your comments carefully.

    # Let's calculate the chunks upfront:
    chunks = [(i-2, i) for i in range(2,9,2)]

    with ProcessPool(4) as pool: # however many processes you want
        pool.map(test_printer, chunks)

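# Guard the entry point so worker processes can import this module safely
# (required on platforms that spawn workers, such as Windows).
if __name__ == '__main__':
    set_chunk_start_end_points()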

Note that unless you need specific chunks, pool.map will chunk the input for you. In this case, it is actually chunking our list of chunks!
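To make that chunking visible, here is a minimal sketch (not part of the original answer) that passes an explicit chunksize to pool.map and has each worker return its lines instead of printing them. The helper names make_chunks, format_chunk, and run_parallel are hypothetical, chosen only for illustration; the sketch assumes the same pair_dict as above and should run on both Python 2.7 and Python 3.

```python
from multiprocessing import Pool

pair_dict = {
    1: 'one', 2: 'two', 3: 'three', 4: 'four',
    5: 'five', 6: 'six', 7: 'seven', 8: 'eight',
}

# Sorted once at import time so every worker sees the same ordering.
sorted_items = sorted(pair_dict.items())


def make_chunks(total, size):
    """Hypothetical helper: (start, end) index pairs covering range(total)."""
    return [(start, min(start + size, total)) for start in range(0, total, size)]


def format_chunk(chunk):
    """Return the formatted lines for one (start, end) slice instead of printing."""
    start, end = chunk
    return ['%s %s' % (key, value) for key, value in sorted_items[start:end]]


def run_parallel(processes=4, chunksize=1):
    """Map format_chunk over our own chunks using a pool of worker processes."""
    chunks = make_chunks(len(sorted_items), 2)
    pool = Pool(processes)
    try:
        # chunksize sets how many of our (start, end) tuples are handed to a
        # worker per task; this is the chunking that pool.map does on its own.
        return pool.map(format_chunk, chunks, chunksize)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # Printing happens in the parent process, after all workers have finished.
    for lines in run_parallel():
        print('\n'.join(lines))
        print('-' * 50)
```

Because pool.map returns results in the same order as its input, printing from the parent keeps the output in a stable order, whereas printing inside the workers can interleave lines from different processes.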

