multiprocessing pool with a dictionary as one of the arguments?

Problem Description

Is it possible to use Pool.map() on a function that takes an (initially empty) dictionary as one of its arguments? I am new to multiprocessing and want to parallelize a web-scraping function. I tried following the example from this site, but it doesn't include a dictionary as one of the arguments. The multiprocessing version works (it prints out the search results), but it does not append to the dictionary: after the processes finish, the dictionary is still empty. It looks like I have to use Manager(), but I don't know how to implement it. Thanks for the help.

from functools import partial
from multiprocessing import Pool
from urllib.request import urlopen as ureq  # ureq is used in callSrch below
from bs4 import BeautifulSoup as soup

count = 1
outerDict = dict()
emptyList = []
lstOfItems = ['Valsartan','Estrace','Norvasc','Combivent',
'Fluvirin','Kariva','Natrl','Foxamax','Vilanterol','Catapres']

def process_search(item, soupPage, outerDict, count, emptyList):
    '''a function that scrapes a site; outerDict and emptyList will
    become populated as it scrapes the site for each item'''

def callSrch(item, outerDict, emptyList, count):
    searchlink = 'http://www.asite.com'
    uClient = ureq(searchlink + item)
    pagehtml = uClient.read()
    soupPage_ = soup(pagehtml, 'html.parser')
    process_search(item, soupPage_, outerDict, count, emptyList)

with Pool() as p:
    prfx = partial(callSrch,outerDict=outerDict,emptyList=emptyList,count=count)
    p.map(prfx, lstOfItems)

Tags: python-3.x, web-scraping, multiprocessing

Solution
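
The reason the dictionary stays empty: each Pool worker runs in its own process with its own copy of the module globals, so mutating outerDict inside a worker never touches the parent's copy. One way to fix this is the Manager() route the question already points at: manager.dict() and manager.list() return proxy objects, and mutations made through a proxy are forwarded back through the manager process, so the parent sees them. Below is a minimal sketch built around the names from the question (callSrch, process_search, the placeholder URL http://www.asite.com). The body of process_search is not shown in the question, so the version here just stores the page title as a stand-in.

from functools import partial
from multiprocessing import Pool, Manager
from urllib.request import urlopen as ureq
from bs4 import BeautifulSoup as soup

lstOfItems = ['Valsartan', 'Estrace', 'Norvasc', 'Combivent',
              'Fluvirin', 'Kariva', 'Natrl', 'Foxamax', 'Vilanterol', 'Catapres']

def process_search(item, soupPage, outerDict, count, emptyList):
    '''stub: the real body is not shown in the question; this version
    just records the page title so the shared dict gets populated'''
    outerDict[item] = soupPage.title.text if soupPage.title else None
    emptyList.append(item)

def callSrch(item, outerDict, emptyList, count):
    searchlink = 'http://www.asite.com'   # placeholder URL from the question
    uClient = ureq(searchlink + item)
    pagehtml = uClient.read()
    uClient.close()
    soupPage_ = soup(pagehtml, 'html.parser')
    process_search(item, soupPage_, outerDict, count, emptyList)

if __name__ == '__main__':
    with Manager() as manager:
        # proxy objects: writes made inside worker processes go through
        # the manager process, so the parent sees them after map() returns
        outerDict = manager.dict()
        emptyList = manager.list()
        with Pool() as p:
            prfx = partial(callSrch, outerDict=outerDict,
                           emptyList=emptyList, count=1)
            p.map(prfx, lstOfItems)
        # copy the data out of the proxies before the manager shuts down
        results = dict(outerDict)
    print(results)

An alternative that avoids shared state entirely is to have callSrch return an (item, result) tuple and build the dictionary in the parent from p.map()'s return value; that is usually simpler and faster, since every read and write to a manager proxy is an inter-process round trip. Either way, keep the Pool/Manager setup under the if __name__ == '__main__': guard, which is required on platforms that spawn rather than fork worker processes.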

