How to speed up creating new columns based on multiple rules

Problem description

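I am about to analyze car dealers' sales over the past several months.

I need to compute the average and the sum over the past 6 and 12 months.

I currently do this in Python with the code below.

The problem is that it takes a very long time when the data is relatively large, although it works fine when the data is small.

Is there a clever way to speed this up?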

# d is a module-level DataFrame with the columns PERIOD_ID, RTL_DLR_ID,
# last6, last12 and MTD_SAL_VOL_MKR_NUM.
def ts(x, y):
    # Initialise the output columns.
    d[x+'sum12'] = 0
    d[x+'sum6'] = 0
    d[x+'year_flag'] = 0
    d[x+'sum_last12'] = 0
    d[x+'sum_last6'] = 0

    for i in range(len(d)):
        row = d.iloc[i]
        idx = d.index[i]  # assumes the index labels are unique
        same_dealer = d['RTL_DLR_ID'] == row['RTL_DLR_ID']
        # Write through d.loc[idx, col]: d.iloc[i][col] = ... can assign to a
        # temporary copy and leave d unchanged.
        d.loc[idx, x+'sum6'] = d.loc[(row['PERIOD_ID'] >= d['PERIOD_ID'])
                                     & (d['PERIOD_ID'] >= row['last6'])
                                     & same_dealer, y].sum()
        d.loc[idx, x+'sum12'] = d.loc[(row['PERIOD_ID'] >= d['PERIOD_ID'])
                                      & (d['PERIOD_ID'] >= row['last12'])
                                      & same_dealer, y].sum()
        d.loc[idx, x+'sum_last6'] = d.loc[(d['PERIOD_ID'] >= (row['last6'] - 100))
                                          & (row['last12'] >= d['PERIOD_ID'])
                                          & same_dealer, y].sum()
        d.loc[idx, x+'sum_last12'] = d.loc[(d['PERIOD_ID'] >= (row['PERIOD_ID'] - 100))
                                           & (row['last12'] >= d['PERIOD_ID'])
                                           & same_dealer, y].sum()
        # Current volume minus the volume in the period stored in last12.
        d.loc[idx, x+'year_flag'] = row['MTD_SAL_VOL_MKR_NUM'] - d.loc[
            (row['last12'] == d['PERIOD_ID']) & same_dealer, y].sum()

Tags: python, pandas, dataframe

Solution
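
One possible way to avoid the row-by-row loop, offered as a minimal sketch rather than a definitive answer: join every row with all rows of the same dealer once, keep the pairs whose PERIOD_ID falls inside that row's window, and sum per row. The helper names `window_sum` and `ts_fast` and the temporary columns `_row`, `_lo` and `_hi` are hypothetical; the column names and window bounds are copied from the question's code. Unlike the original, `ts_fast` takes the DataFrame as an argument and returns a modified copy.

```python
import numpy as np
import pandas as pd


def window_sum(d, y, lower, upper):
    """For each row i, sum column y over rows of the same dealer whose
    PERIOD_ID lies in [lower[i], upper[i]] (both bounds inclusive)."""
    n = len(d)
    left = d[["RTL_DLR_ID"]].copy()
    left["_row"] = np.arange(n)        # position of the row being computed
    left["_lo"] = lower.to_numpy()     # per-row lower bound
    left["_hi"] = upper.to_numpy()     # per-row upper bound
    right = d[["RTL_DLR_ID", "PERIOD_ID", y]]

    # One row per (target row, candidate row) pair within the same dealer.
    pairs = left.merge(right, on="RTL_DLR_ID")
    in_window = (pairs["PERIOD_ID"] >= pairs["_lo"]) & (pairs["PERIOD_ID"] <= pairs["_hi"])
    sums = pairs.loc[in_window].groupby("_row")[y].sum()

    # Rows whose window matched nothing get 0, like .sum() on an empty selection.
    return sums.reindex(np.arange(n), fill_value=0).to_numpy()


def ts_fast(d, x, y):
    """Vectorised counterpart of ts(x, y), with the same window bounds."""
    out = d.copy()
    cur, l6, l12 = out["PERIOD_ID"], out["last6"], out["last12"]
    out[x + "sum6"] = window_sum(d, y, l6, cur)
    out[x + "sum12"] = window_sum(d, y, l12, cur)
    out[x + "sum_last6"] = window_sum(d, y, l6 - 100, l12)
    out[x + "sum_last12"] = window_sum(d, y, cur - 100, l12)
    # Current volume minus the volume in the period stored in last12.
    out[x + "year_flag"] = out["MTD_SAL_VOL_MKR_NUM"] - window_sum(d, y, l12, l12)
    return out
```

The self-join creates one row per pair of rows belonging to the same dealer, so memory grows with the square of the number of periods per dealer. That is usually acceptable when each dealer has at most a few dozen periods; for much longer histories, a per-dealer rolling window may be a better fit.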

