Cumulative duration build in Python

Problem description

I have two Pandas DataFrames. df1 contains build factors, and df2 contains dollars accumulated through different durations:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'dur': [1, 2, 3, 4],
                    'build': [35, .96, .25, .10]})
df2 = pd.DataFrame({'dur': [1, 3, 4, 3, 4, 4],
                    'out': [1, 1, 1, 2, 2, 3],
                    'ytd_dol': [110, 3600, 6302, 1300, 3450, 1550]})
print(df1)
#   dur  build
#0    1  35.00
#1    2   0.96
#2    3   0.25
#3    4   0.10

print(df2)
#   dur  out  ytd_dol
#0    1    1      110
#1    3    1     3600
#2    4    1     6302
#3    3    2     1300
#4    4    2     3450
#5    4    3     1550

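What I would like is to project a new ytd_dol value in df2 for each dur/out combination, using the build factors from df1. The manual calculation for each row is as follows: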
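One way to read the per-row calculation, inferred from the expected values for this data rather than stated explicitly: for a row with duration d and out value k, the k build factors ending at duration d are chained, and every partial product is added to the original amount. A minimal sketch for the row dur=4, out=3, ytd_dol=1550 (the variable names here are illustrative):

```python
# Build factors from df1, keyed by duration.
build = {1: 35, 2: 0.96, 3: 0.25, 4: 0.10}

dur, out, ytd_dol = 4, 3, 1550

# Chain the factors for durations dur-out+1 .. dur (here 2, 3, 4),
# adding each partial product to the running total.
proj = ytd_dol
term = ytd_dol
for d in range(dur - out + 1, dur + 1):
    term *= build[d]   # roughly 1488, then 372, then 37.2
    proj += term

print(round(proj, 2))  # 3447.2
```

With out=1 the loop runs once, so the projection reduces to ytd_dol * (1 + build[dur]), for example 110 * (1 + 35) = 3960 for the first row.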

Using the equation above, df2 would be updated with a new column, as follows:

print(df2)
#   dur  out  ytd_dol  proj_ytd_dol
#0    1    1      110       3960.00
#1    3    1     3600       4500.00
#2    4    1     6302       6932.20
#3    3    2     1300       2860.00
#4    4    2     3450       4398.75
#5    4    3     1550       3447.20

Tags: python, pandas, dataframe

Solution


I finally understand your logic.

The code below should give the output you want.

# Look up each build factor once instead of repeating the map call.
build = df1.set_index('dur')['build']
b0 = df2['dur'].map(build)          # factor at dur
b1 = (df2['dur'] - 1).map(build)    # factor at dur - 1 (NaN if missing)
b2 = (df2['dur'] - 2).map(build)    # factor at dur - 2 (NaN if missing)
ytd = df2['ytd_dol']

# One expression per value of 'out'; np.select picks the matching one per row.
result = np.select([df2['out'] == 1,
                    df2['out'] == 2,
                    df2['out'] == 3],
                   [ytd + ytd * b0,
                    ytd + ytd * b1 + ytd * b1 * b0,
                    ytd + ytd * b2 + ytd * b2 * b1 + ytd * b2 * b1 * b0])

df2['proj_ytd_dol'] = result

Output

print(df2)
#   dur  out  ytd_dol  proj_ytd_dol
#0    1    1      110       3960.00
#1    3    1     3600       4500.00
#2    4    1     6302       6932.20
#3    3    2     1300       2860.00
#4    4    2     3450       4398.75
#5    4    3     1550       3447.20
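
The np.select call above spells out one expression per out value, so it only covers out = 1, 2 and 3. If larger values can occur, looping over the steps is one possible generalization. This is a sketch rather than part of the original answer: the helper name project_ytd is hypothetical, and it assumes df1 holds a build factor for every duration a row needs.

```python
import pandas as pd

df1 = pd.DataFrame({'dur': [1, 2, 3, 4],
                    'build': [35, .96, .25, .10]})
df2 = pd.DataFrame({'dur': [1, 3, 4, 3, 4, 4],
                    'out': [1, 1, 1, 2, 2, 3],
                    'ytd_dol': [110, 3600, 6302, 1300, 3450, 1550]})


def project_ytd(df2, df1):
    """Hypothetical helper: project ytd_dol using chained build factors."""
    build = df1.set_index('dur')['build']
    proj = df2['ytd_dol'].astype(float)
    term = df2['ytd_dol'].astype(float)
    # Step k uses the factor at duration dur - out + k, for k = 1 .. out.
    for k in range(1, int(df2['out'].max()) + 1):
        active = k <= df2['out']                  # rows still accumulating
        factor = (df2['dur'] - df2['out'] + k).map(build)
        term = term * factor.where(active, 0.0)   # finished rows add 0
        proj = proj + term
    return proj


df2['proj_ytd_dol'] = project_ytd(df2, df1)
print(df2)
```

For the sample data this should reproduce the proj_ytd_dol column shown above, up to floating-point rounding. A row that needs a duration missing from df1 would come out as NaN instead of raising an error.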
