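python - How to find the largest number in a list of numbers using multiprocessing
Problem description
I have a list of random numbers and I want to find the largest one using multiprocessing.
This is the code I used to generate the list: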
import random
randomlist = []
for i in range(100000000):
    n = random.randint(1,30000000)
    randomlist.append(n)
To get the largest number with a serial process:
import time
greatest = 0 # global variable
def f(n):
    global greatest
    if n>greatest:
        greatest = n
if __name__ == "__main__":
    t2 = time.time()
    greatest = 0
    for x in randomlist:
        f(x)
    print("serial process took:", time.time()-t2)
    print("greatest = ", greatest)
This is my attempt to get the largest number using multiprocessing:
from multiprocessing import Pool
import time
greatest = 0 # the global variable
def f(n):
    global greatest
    if n>greatest:
        greatest = n
if __name__ == "__main__":
    greatest = 0
    t1 = time.time()
    p = Pool() #(processes=3)
    result = p.map(f,randomlist)
    p.close()
    p.join()
    print("pool took:", time.time()-t1)
    print("greatest = ", greatest)
The output here is 0. Clearly the global variable is not shared between the processes. How can I solve this without hurting performance?
Solution
As @Barmar suggested, split your randomlist into chunks, compute the local maximum of each chunk, and finally take the global maximum of local_maximum_list:
import multiprocessing as mp
import numpy as np
import random
import time
CHUNKSIZE = 10000
def local_maximum(l):
    # Runs in a worker process and returns the maximum of one chunk
    m = max(l)
    print(f"Local maximum: {m}")
    return m
if __name__ == '__main__':
    randomlist = np.random.randint(1, 30000000, 100000000)
    start = time.time()
    # Lazily slice the data into chunks of CHUNKSIZE elements
    chunks = (randomlist[i:i+CHUNKSIZE]
              for i in range(0, len(randomlist), CHUNKSIZE))
    with mp.Pool(mp.cpu_count()) as pool:
        local_maximum_list = pool.map(local_maximum, chunks)
    # Reduce the per-chunk maxima to the global maximum in the parent process
    print(f"Global maximum: {max(local_maximum_list)}")
    end = time.time()
    print(f"MP Elapsed time: {end-start:.2f}s")
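For comparison, here is a minimal sketch of the same chunk-then-reduce idea using only the standard library. The helper names split_into_chunks and parallel_max, and the list and chunk sizes, are made up for this illustration and are not part of the original answer. It uses the built-in max as the worker function, so each task sends back a single number.

```python
import multiprocessing as mp
import random


def split_into_chunks(seq, size):
    """Yield consecutive slices of seq, each at most `size` items long."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def parallel_max(seq, chunk_size, processes=None):
    """Return max(seq) by reducing per-chunk maxima computed in a Pool.

    Hypothetical helper for illustration; processes=None lets Pool use
    its default worker count.
    """
    with mp.Pool(processes) as pool:
        # Each worker receives one chunk and returns only its maximum,
        # so no shared global variable is needed.
        local_maxima = pool.map(max, split_into_chunks(seq, chunk_size))
    # Final reduction happens in the parent process.
    return max(local_maxima)


if __name__ == "__main__":
    # Smaller list than in the question, just to keep the demo quick.
    data = [random.randint(1, 30_000_000) for _ in range(200_000)]
    result = parallel_max(data, chunk_size=20_000)
    print("parallel max:", result)
    print("matches serial max:", result == max(data))
```

The result comes back as a return value, not through a global. In the original attempt each worker process updated its own copy of greatest, and the parent's copy was left at 0.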
Performance
It is interesting to see how much the way the random list is created affects multiprocessing performance:
Scenario 1:
randomlist = np.random.randint(1, 30000000, 100000000)
MP Elapsed time: 1.63s
Scenario 2:
randomlist = np.random.randint(1, 30000000, 100000000).tolist()
MP Elapsed time: 6.02s
Scenario 3:
randomlist = [random.randint(1, 30000000) for _ in range(100000000)]
MP Elapsed time: 7.14s
Scenario 4:
randomlist = list(np.random.randint(1, 30000000, 100000000))
MP Elapsed time: 184.28s
Scenario 5:
randomlist = []
for _ in range(100000000):
    n = random.randint(1, 30000000)
    randomlist.append(n)
MP Elapsed time: 7.52s