I'm not sure how to parallelize this for loop with the multiprocessing module

Problem description

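I want to use multiprocessing to reduce the time it takes to complete a for loop, but I'm not sure how to go about it, because I haven't found a clear basic usage pattern for the module that I can apply to this code.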

    allLines = fileRead.readlines()
    allLines = [x.strip() for x in allLines]
    for i in range (0,len(allLines)):
        currentWord = allLines[currentLine]
        currentLine += 1
        currentURL = URL+currentWord
        uClient = uReq(currentURL)
        pageHTML = uClient.read()
        uClient.close()
        pageSoup = soup(pageHTML,'html.parser')
        pageHeader = str(pageSoup.h1)
        if 'Sorry!' in pageHeader:
            with open(fileA,'a') as fileAppend:
                fileAppend.write(currentWord + '\n')
            print(currentWord,'available')
        else:
            print(currentWord,'taken')

Edit: new code, but it's still broken...

    allLines = fileRead.readlines()
    allLines = [x.strip() for x in allLines]
    def f(indexes, allLines):
        for i in indexes:
            currentWord = allLines[currentLine]
            currentLine += 1
            currentURL = URL+currentWord
            uClient = uReq(currentURL)
            pageHTML = uClient.read()
            uClient.close()
            pageSoup = soup(pageHTML,'html.parser')
            pageHeader = str(pageSoup.h1)
            if 'Sorry!' in pageHeader:
                with open(fileA,'a') as fileAppend:
                    fileAppend.write(currentWord + '\n')
                print(currentWord,'available')
            else:
                print(currentWord,'taken')
    for i in range(threads):
        indexes = range(i*len(allLines), i*len(allLines)+threads, 1)
        Thread(target=f, args=(indexes, allLines)).start()

Tags: python, python-multiprocessing

Solution


  • Put the code in a function
  • Split the indexes
  • Start the threads

    from threading import Thread

    THREADS = 10

    # fileRead, URL, fileA, uReq and soup are defined elsewhere in the asker's script.
    allLines = fileRead.readlines()
    allLines = [x.strip() for x in allLines]

    def f(indexes, allLines):
        # Each thread runs the original loop body over its own slice of indexes.
        for i in indexes:
            # Index with i: a shared currentLine counter would be unbound inside f.
            currentWord = allLines[i]
            currentURL = URL+currentWord
            uClient = uReq(currentURL)
            pageHTML = uClient.read()
            uClient.close()
            pageSoup = soup(pageHTML,'html.parser')
            pageHeader = str(pageSoup.h1)
            if 'Sorry!' in pageHeader:
                with open(fileA,'a') as fileAppend:
                    fileAppend.write(currentWord + '\n')
                print(currentWord,'available')
            else:
                print(currentWord,'taken')

    # Split the indexes into THREADS contiguous chunks (ceiling division, so no line is skipped).
    chunk = -(-len(allLines) // THREADS)
    for i in range(THREADS):
        indexes = range(i*chunk, min((i+1)*chunk, len(allLines)))
        Thread(target=f, args=(indexes, allLines)).start()
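
As an alternative to splitting the indexes by hand, the standard library's concurrent.futures.ThreadPoolExecutor can hand the words out to worker threads for you. The following is only a minimal sketch, not the method from the answer above: check_words and is_available are hypothetical names, and is_available is assumed to wrap the request-and-parse step from the loop (fetch URL+currentWord and look for 'Sorry!' in the h1). Results are collected in the main thread, so only one place would need to write to the file.

```python
from concurrent.futures import ThreadPoolExecutor


def check_words(words, is_available, max_workers=10):
    """Call is_available(word) for every word on a pool of worker threads.

    is_available is a hypothetical callable standing in for the loop body
    (fetch the page for the word and test whether its header says 'Sorry!').
    Returns (available, taken), each in the same order as the input.
    """
    words = list(words)  # allow any iterable, since we walk it more than once
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() schedules one call per word and yields results in input order.
        results = list(pool.map(is_available, words))

    available = [w for w, ok in zip(words, results) if ok]
    taken = [w for w, ok in zip(words, results) if not ok]
    return available, taken


if __name__ == "__main__":
    # Stand-in checker so the sketch runs without network access.
    def fake_is_available(word):
        return len(word) % 2 == 0

    available, taken = check_words(["ab", "abc", "abcd"], fake_is_available)
    print("available:", available)
    print("taken:", taken)
```

Because the work here is mostly waiting on network responses, threads are usually enough; if the per-word work were CPU-bound, multiprocessing.Pool offers a similar map interface.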
