Parallelizing a Python for loop that iterates over lists of function arguments

Problem description

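I need to parallelize a Python for loop in which each iteration calls a function that takes two arguments and returns two results, then appends those results to two different lists. The for loop iterates over two lists of arguments.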

So say I have the following code:

def my_f(a, b):
    res1 = a + b
    res2 = a * b
    return res1, res2
    
# lists of arguments
args1 = [1, 2, 3, 4]  
args2 = [5, 6, 7, 8]
    
res_list1, res_list2 = [], []
for i in range(len(args1)):  # loop to parallelize
    res1, res2 = my_f(args1[i], args2[i])
    res_list1.append(res1)
    res_list2.append(res2)

The result should be

res_list1 = [6, 8, 10, 12]
res_list2 = [5, 12, 21, 32]

How would I make this run in parallel?

I know that in C/C++ you can just use #pragma omp for to get parallelism. Is there something similar in Python?

I am using Python 3.8.5 on Linux, but I need this to work on any operating system.

Tags: python, list, for-loop, parallel-processing

Solution


You can use Python's multiprocessing.Pool to achieve this; see the docs (https://docs.python.org/3/library/multiprocessing.html#using-a-pool-of-workers). However, instead of map, you will want starmap, because your function takes more than one argument. Here is how I would do it:

from multiprocessing import Pool

def my_f(a, b):
    res1 = a + b
    res2 = a * b
    return res1, res2
   

if __name__ == '__main__':
    args1 = [1, 2, 3, 4]  
    args2 = [5, 6, 7, 8]
        
    
    # starmap unpacks each (a, b) tuple produced by zip into my_f(a, b)
    with Pool(processes=4) as pool:
        res = pool.starmap(my_f, zip(args1, args2))

    res_list1 = [r[0] for r in res]
    res_list2 = [r[1] for r in res]

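First, notice that the main code is in the if __name__ == '__main__': block. This matters for multiprocessing because Python creates new processes, not threads. On platforms that start workers with the spawn method (Windows, and macOS by default since Python 3.8), each worker re-imports the main module, so any unguarded top-level code, including the pool creation itself, would run again in every worker. Code inside the if block is run only by the main process.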

Second, I combined your two lists into a single iterable using the built-in zip function. This is needed because starmap expects an iterable of argument tuples, each of which is unpacked into the function's positional arguments.
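To see what starmap does with the zipped arguments, it can help to look at a serial analogue. The sketch below uses itertools.starmap from the standard library, which unpacks each tuple into positional arguments in the same way but runs in a single process. It is only meant to illustrate the calling convention, not to replace the pool; the names pairs, serial_res and explicit_res are made up for this illustration.

```python
from itertools import starmap


def my_f(a, b):
    return a + b, a * b


args1 = [1, 2, 3, 4]
args2 = [5, 6, 7, 8]

# zip pairs the i-th elements of both lists into (a, b) tuples
pairs = list(zip(args1, args2))

# starmap calls my_f(*pair) for every pair, one after another
serial_res = list(starmap(my_f, pairs))

# the same thing written as an explicit comprehension
explicit_res = [my_f(*pair) for pair in pairs]
```

Here pairs is [(1, 5), (2, 6), (3, 7), (4, 8)], and both result lists hold one (res1, res2) tuple per pair. pool.starmap follows the same unpacking rule, but distributes the calls across worker processes.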

Finally, the last two lines split res into two lists like the ones in your example. This is needed because starmap returns a list of tuples, one (res1, res2) pair per call, in the same order as the inputs.
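As an alternative to the two list comprehensions, the list of tuples can be transposed in one step with zip(*res). This is a minimal sketch: the res value is hard-coded to the shape starmap returns for the example inputs, so the snippet runs without starting a pool.

```python
# Shape of the starmap output for the example inputs (hard-coded for illustration)
res = [(6, 5), (8, 12), (10, 21), (12, 32)]

# zip(*res) groups all first elements together and all second elements together;
# it yields tuples, so each one is converted to a list
res_list1, res_list2 = (list(column) for column in zip(*res))
```

Note that this unpacking assumes res is non-empty. With empty argument lists, zip(*res) yields nothing and the two-name assignment raises a ValueError, whereas the list comprehensions simply produce two empty lists.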

