My code runs 100 iterations in 2 seconds, 1,000 in 8 seconds, and 10,000 in 11 minutes

Problem description

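I'm a hobbyist programmer, and this is just a small project I set for myself. I know I most likely have something in this code that is inefficient in a way that doesn't matter for small loops but compounds when I scale it up. Any suggestions would be much appreciated.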

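# Imports inferred from the aliases used below (npr, pd).
import concurrent.futures
import datetime

import numpy.random as npr
import pandas as pd


def RndSelection(ProjMatrix):
    # Percentile break points, matched in order to row[3]..row[15].
    percentiles = [0, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 99]
    results = []

    for row in ProjMatrix.itertuples():
        x = npr.randint(1, 100)  # random integer from 1 to 99

        for p in range(3, 16):
            if p < 15:
                a = percentiles[p - 3]
                b = percentiles[p - 2]

                if x in range(a, b):
                    # Interpolate linearly between the two neighbouring columns.
                    factor = (b - x) / (b - a)
                    r = round((row[p] * factor) + (row[p + 1] * (1 - factor)), 2)
                    break
            else:
                # Only reached when x == 99: take the last column as-is.
                r = row[p]

        results.append(r)

    thisrun = pd.DataFrame(results)
    return thisrun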
                    

def main():
    ts = datetime.datetime.now()
    print('Run Started: ', ts)

    Matrix = SetMatrix()  # SetMatrix() is defined elsewhere (not shown)
    Outcome = Matrix['player_id']

    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = [executor.submit(RndSelection, Matrix) for _ in range(10000)]

        for f in concurrent.futures.as_completed(results):
            thisrun = f.result()
            Outcome = pd.concat([Outcome, thisrun], axis=1)

    print(Outcome)

    ts = datetime.datetime.now()
    print('Run Completed: ', ts)


if __name__ == '__main__':
    main()

Tags: python-3.x, optimization, multiprocessing

Solution


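So, as Jérôme pointed out, the answer was the concat on every iteration. Each pd.concat call copies everything accumulated so far, so the total work grows roughly quadratically with the number of runs.

Moving the output into a list of lists and concatenating only once brought the run time for 10,000 iterations down to 8 seconds, and for 100,000 iterations down to 2 minutes 34 seconds.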
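
To see the two patterns side by side, here is a minimal, self-contained sketch. It is not the code from the post: the function names, the five made-up player ids, and the random run data are all assumptions for illustration. It only shows that collecting plain lists and concatenating once yields the same table as concatenating inside the loop.

```python
import numpy as np
import pandas as pd


def concat_each_iteration(ids, runs):
    """Slow pattern: every pd.concat call copies all columns gathered so far."""
    outcome = pd.DataFrame({"player_id": ids})
    for i, run in enumerate(runs):
        outcome = pd.concat([outcome, pd.Series(run, name=i)], axis=1)
    return outcome


def concat_once(ids, runs):
    """Faster pattern: keep plain lists, build one frame, concatenate once."""
    all_runs = pd.DataFrame(runs).transpose()  # one column per run
    return pd.concat([pd.DataFrame({"player_id": ids}), all_runs], axis=1)


# Hypothetical data: 5 players, 20 runs of made-up results.
rng = np.random.default_rng(0)
ids = list(range(5))
runs = [rng.random(5).round(2).tolist() for _ in range(20)]

slow = concat_each_iteration(ids, runs)
fast = concat_once(ids, runs)
print(slow.shape, fast.shape)
```

With only 20 runs the timing difference is negligible; it should become visible as the number of runs grows, because the first function re-copies an ever wider frame on each pass.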

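# Imports inferred from the aliases used below (np, npr, pd).
import concurrent.futures
import datetime

import numpy as np
import numpy.random as npr
import pandas as pd


def RndSelection(ProjMatrix):
    # Percentile break points, matched in order to row[3]..row[15].
    percentiles = [0, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 99]
    results = []
    r = ""

    for row in ProjMatrix.itertuples():
        x = npr.randint(1, 100)  # random integer from 1 to 99

        for p in range(3, 16):
            if p < 15:
                a = percentiles[p - 3]
                b = percentiles[p - 2]

                if x in range(a, b):
                    # Interpolate linearly between the two neighbouring columns.
                    factor = (b - x) / (b - a)
                    r = round((row[p] * factor) + (row[p + 1] * (1 - factor)), 2)
                    break
            else:
                # Only reached when x == 99: take the last column as-is.
                r = row[p]

        results.append(r)

    # Return a plain list instead of a DataFrame.
    return results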
                    




def main():
    ts = datetime.datetime.now()
    print('Run Started: ', ts)

    Matrix = SetMatrix()  # SetMatrix() is defined elsewhere (not shown)
    runs = 100000
    s = 0
    Outcome = pd.DataFrame(Matrix['player_id'])

    # Pre-allocate one empty list per run.
    thisrun = np.empty((runs, 0)).tolist()

    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = [executor.submit(RndSelection, Matrix) for _ in range(runs)]

        for f in concurrent.futures.as_completed(results):
            # Store each finished run as a plain list; no DataFrame work in the loop.
            thisrun[s] = f.result()
            s += 1

    # Build the DataFrame and concatenate exactly once.
    allruns = pd.DataFrame(thisrun).transpose()
    Outcome = pd.concat([Outcome, allruns], axis=1)

    ts = datetime.datetime.now()
    print('Run Completed: ', ts)


if __name__ == '__main__':
    main()
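
A possible further step, not part of the original answer, is to replace the per-row Python loop with NumPy array operations so that all runs are drawn and interpolated at once. The sketch below swaps in a vectorized technique under some assumptions: the 13 value columns are passed as a plain float array (one column per percentile break point), and the names `interpolate`, `simulate`, and the sample data are hypothetical. It is meant to reproduce the same linear interpolation as `RndSelection`, including returning the last column unrounded when the draw is 99.

```python
import numpy as np

PERCENTILES = np.array([0, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 99])


def interpolate(values, x):
    """Interpolate each row of `values` at the percentile draws in `x`.

    values: (n_rows, 13) floats, one column per entry of PERCENTILES.
    x:      (n_rows, n_runs) integers in 1..99.
    Returns an (n_rows, n_runs) array.
    """
    # Index of the break point at or below each draw; 99 reuses the last interval.
    k = np.searchsorted(PERCENTILES, x, side="right") - 1
    k = np.minimum(k, len(PERCENTILES) - 2)
    a, b = PERCENTILES[k], PERCENTILES[k + 1]
    factor = (b - x) / (b - a)
    lower = np.take_along_axis(values, k, axis=1)
    upper = np.take_along_axis(values, k + 1, axis=1)
    result = np.round(lower * factor + upper * (1 - factor), 2)
    # The loop version returns the last column unrounded when the draw is 99.
    return np.where(x == 99, upper, result)


def simulate(values, n_runs, rng):
    """Draw one percentile per row and run, then interpolate everything at once."""
    x = rng.integers(1, 100, size=(values.shape[0], n_runs))  # 1..99, like npr.randint(1, 100)
    return interpolate(values, x)


# Hypothetical data: 4 rows of made-up, increasing values.
rng = np.random.default_rng(42)
values = np.sort(rng.random((4, 13)) * 30, axis=1)
all_runs = simulate(values, 1000, rng)
print(all_runs.shape)  # (4, 1000)
```

If the value columns really are the third through fifteenth columns of the DataFrame (which is what `row[3]` to `row[15]` of `itertuples()` would correspond to, since `row[0]` is the index), something like `Matrix.iloc[:, 2:15].to_numpy(dtype=float)` could supply `values`, but that depends on what `SetMatrix()` returns. Because this runs in a single process, it also avoids sending the DataFrame to a worker for every run.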
