Generating multiple plots by querying MongoDB with multiprocessing

Problem description

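I want to speed up a plotting function that looks up its data in MongoDB Atlas. I used an example from the web, but I'm not sure it is the right implementation. Using multiprocessing.Pool() seems to be slower than running without the wrapper. What am I doing wrong? Thanks.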

Jupyter notebook output

from pymongo import MongoClient
from matplotlib.backends.backend_svg import FigureCanvasSVG
from matplotlib.figure import Figure
import io
import multiprocessing
import time

lstOfwavelengths = list(range(220, 810, 10))

def build_graph_mongo_multiproc(pltcodeWithSuffix, wellID):
    # A new client is opened on every call to this function
    client = MongoClient()
    db = client.databasename
    img = io.BytesIO()
    fig = Figure(figsize=(0.6, 0.6))
    axis = fig.add_subplot(1, 1, 1)
    # Assumes one document per wavelength with one field per well ID,
    # the layout that getAllWellVals in the solution reads
    cursor = db[pltcodeWithSuffix].find({}, {wellID: 1, "_id": 0})
    absvals = [doc[wellID] for doc in cursor]
    axis.plot(lstOfwavelengths, absvals)
    axis.set_title(f'{pltcodeWithSuffix}:{wellID}', fontsize=9)
    axis.title.set_position([.5, .6])
    axis.tick_params(
            which='both',
            bottom=False,
            left=False,
            labelbottom=False,
            labelleft=False)
    # Render the figure to SVG and keep the raw bytes
    FigureCanvasSVG(fig).print_svg(img)
    # lstOfPlts must already exist as a list in the notebook
    lstOfPlts.append(img.getvalue())

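The only difference between the single-process and multiprocessing functions is that, in the single-process version, MongoClient is called once, outside the function.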

Tags: python-3.x, mongodb, matplotlib, multiprocessing

Solution


I found this great article: The Effective way of using multiprocessing with pymongo

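Using the article as a template, I was able to cut the computation time from 21 seconds to about 7.5 seconds. I'm sure someone more experienced could save even more time, but I think this is good enough for my level.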

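# Manager-backed list that worker processes can append to
manager = multiprocessing.Manager()
lstOfPlots = manager.list()

def chunks(l, n):
    # Yield successive slices of l, each at most n items long
    for i in range(0, len(l), n):
        yield l[i:i + n]

def getAllWellVals(db, pltcodeWithSuffix, wellID):
    # Project only the wellID field and collect its value from every document
    cursor = db[pltcodeWithSuffix].find({}, {wellID: 1, '_id': 0})
    return [doc[wellID] for doc in cursor]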

def build_graph_mongo_multiproc(chunk, pltcodeWithSuffix):
    # connect_string and dbname must be defined elsewhere;
    # each process opens its own client
    client = MongoClient(connect_string, maxPoolSize=10000)
    db = client[dbname]
    # Loop over the well IDs in the chunk and plot each one
    for wid in chunk:
        img = io.BytesIO()
        fig = Figure(figsize=(0.6, 0.6))
        axis = fig.add_subplot(1, 1, 1)
        absVals = getAllWellVals(db, pltcodeWithSuffix, wid)
        axis.plot(lstOfwavelengths, absVals)
        axis.set_title(f'{wid}', fontsize=9)
        axis.title.set_position([.5, .6])
        axis.tick_params(
                which='both',
                bottom=False,
                left=False,
                labelbottom=False,
                labelleft=False)
        # Render the figure to SVG and store the bytes in the shared list
        FigureCanvasSVG(fig).print_svg(img)
        lstOfPlots.append(img.getvalue())
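The code above does not show how the worker function is launched. One possible driver, sketched here under assumptions rather than taken from the original post, splits the well IDs into roughly equal chunks and starts one `multiprocessing.Process` per chunk. The names `chunk_size` and `run_in_processes` and the `n_procs` parameter are hypothetical; `chunks` is repeated only so the block stands alone.

```python
import multiprocessing


def chunks(items, n):
    """Yield successive slices of items, each at most n long."""
    for i in range(0, len(items), n):
        yield items[i:i + n]


def chunk_size(n_items, n_procs):
    """Ceiling division, so the items split into at most n_procs chunks."""
    return max(1, -(-n_items // n_procs))


def run_in_processes(worker, well_ids, pltcode, n_procs=4):
    """Start one process per chunk of well IDs and wait for all of them.

    worker is called as worker(chunk, pltcode), matching the signature of
    build_graph_mongo_multiproc. Returns the number of processes started.
    """
    size = chunk_size(len(well_ids), n_procs)
    procs = [multiprocessing.Process(target=worker, args=(chunk, pltcode))
             for chunk in chunks(well_ids, size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return len(procs)
```

A call such as `run_in_processes(build_graph_mongo_multiproc, well_ids, pltcodeWithSuffix)` would then fill `lstOfPlots`, where `well_ids` is a hypothetical list of well names. This relies on the child processes inheriting the module-level `lstOfPlots`, which holds with the fork start method; with spawn it would likely be safer to pass the managed list to the worker as an argument.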
