Multiplying values based on a different DataFrame

Problem description

I have two DataFrames. df1:

  Name   Segment Axis   1    2    3    4    5
  Amazon 1       slope  NaN  2.5  2.5  2.5  2.5
  Amazon 1       x      0.0  1.0  2.0  3.0  4.0
  Amazon 1       y      0.0  0.4  0.8  1.2  1.6
  Amazon 2       slope  NaN  2.0  2.0  2.0  2.0
  Amazon 2       x      0.0  2.0  4.0  6.0  8.0
  Amazon 2       y      0.0  1.0  2.0  3.0  4.0

df2:

  Name    Segment Cost
  Amazon  1       100
  Amazon  2       112
  Netflix 1       110
  Netflix 2       210

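I want to multiply all values in columns 1-5 of the rows where Axis is "slope" by the corresponding Cost from the second DataFrame.

Expected output: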

    Name   Segment Axis   1    2    3    4    5
    Amazon 1       slope  NaN  250  250  250  250
    Amazon 1       x      0.0  1.0  2.0  3.0  4.0
    Amazon 1       y      0.0  0.4  0.8  1.2  1.6
    Amazon 2       slope  NaN  224  224  224  224
    Amazon 2       x      0.0  2.0  4.0  6.0  8.0
    Amazon 2       y      0.0  1.0  2.0  3.0  4.0

Tags: python, pandas, numpy

Solution


Try this:

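# left-merge df2 onto df1 so every df1 row carries its matching Cost
u = df1.merge(df2, on=['Name', 'Segment'], how='left')
# value columns to multiply: everything except the key and label columns
cols = df1.columns.difference(['Name', 'Segment', 'Axis'])
# multiply by Cost, keep the product only on 'slope' rows, and assign back
# (relies on df1 having the default 0..n-1 index so it lines up with u)
df1[cols] = u[cols].mul(u['Cost'], axis=0).where(df1['Axis'].eq('slope'), df1[cols])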

print(df1)

     Name  Segment   Axis    1      2      3      4      5
0  Amazon        1  slope  NaN  250.0  250.0  250.0  250.0
1  Amazon        1      x  0.0    1.0    2.0    3.0    4.0
2  Amazon        1      y  0.0    0.4    0.8    1.2    1.6
3  Amazon        2  slope  NaN  224.0  224.0  224.0  224.0
4  Amazon        2      x  0.0    2.0    4.0    6.0    8.0
5  Amazon        2      y  0.0    1.0    2.0    3.0    4.0
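
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same merge-then-multiply idea. It rebuilds both frames from the question, assuming the value column labels are the strings "1" to "5". The helper name scale_slopes is hypothetical, and validate="many_to_one" is added on the assumption that each (Name, Segment) pair appears only once in df2.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question (string column labels are an assumption).
df1 = pd.DataFrame({
    "Name": ["Amazon"] * 6,
    "Segment": [1, 1, 1, 2, 2, 2],
    "Axis": ["slope", "x", "y"] * 2,
    "1": [np.nan, 0.0, 0.0, np.nan, 0.0, 0.0],
    "2": [2.5, 1.0, 0.4, 2.0, 2.0, 1.0],
    "3": [2.5, 2.0, 0.8, 2.0, 4.0, 2.0],
    "4": [2.5, 3.0, 1.2, 2.0, 6.0, 3.0],
    "5": [2.5, 4.0, 1.6, 2.0, 8.0, 4.0],
})
df2 = pd.DataFrame({
    "Name": ["Amazon", "Amazon", "Netflix", "Netflix"],
    "Segment": [1, 2, 1, 2],
    "Cost": [100, 112, 110, 210],
})


def scale_slopes(df1, df2):
    """Return a copy of df1 whose 'slope' rows are multiplied by the matching Cost."""
    out = df1.copy()
    # A left merge keeps df1's row order; validate guards against duplicate keys in df2.
    merged = out.merge(df2, on=["Name", "Segment"], how="left", validate="many_to_one")
    # Re-attach df1's index so the costs align row by row even if the index is not 0..n-1.
    cost = pd.Series(merged["Cost"].to_numpy(), index=out.index)
    cols = out.columns.difference(["Name", "Segment", "Axis"])
    is_slope = out["Axis"].eq("slope")
    # Only the slope rows are touched; x and y rows stay as they were.
    out.loc[is_slope, cols] = out.loc[is_slope, cols].mul(cost[is_slope], axis=0)
    return out


result = scale_slopes(df1, df2)
print(result)
```

Unlike the snippet above, this version returns a new frame instead of modifying df1 in place, and it does not depend on df1 having a default index.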
