Group by one column, but sum two other columns and count a third

Problem description

My df:

df_RFQ_by_Salesperson = df[
                          (df['state'].str.contains('Done'))
                          ][['sales_person_name2',
                             'rfq_qty',
                             'rfq_qty_CAD_Equiv',
                             'state'
                            ]].copy()

display(df_RFQ_by_Salesperson.head(3))

    sales_person_name2  rfq_qty     rfq_qty_CAD_Equiv   state
14  AY                 200000.0     2.568713e+05        Done
22  AY                 1000000.0    1.284357e+06        Done
28  YJJ               25000000.0    4.420085e+07        Done

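I want to group df_RFQ_by_Salesperson by sales_person_name2, sum rfq_qty, sum rfq_qty_CAD_Equiv, and then add a count of state along with a percentage column based on rfq_qty_CAD_Equiv. I have figured out the sums and the percentage column, but I'm not sure how to work in the count of state.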

df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.rename(columns={'state':'Done Trades'}, level=0) # rename the column header in the groupby
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.groupby(['sales_person_name2'])[['rfq_qty','rfq_qty_CAD_Equiv']].sum()  # select several columns with a list (double brackets)
Total_Done_Volume = df_RFQ_by_Salesperson['rfq_qty_CAD_Equiv'].sum()
df_RFQ_by_Salesperson['Percentage'] = df_RFQ_by_Salesperson['rfq_qty_CAD_Equiv'].div(Total_Done_Volume)

display(df_RFQ_by_Salesperson.sort_values('Percentage',ascending=False))

sales_person_name2  rfq_qty     rfq_qty_CAD_Equiv   Percentage  Count of State      
MP                  214400000.0 3.045802e+08        0.258089        ?
AC                  228800000.0 2.648099e+08        0.224390        ?
YJJ                 202500000.0 2.490527e+08        0.211038        ?
RW                  129000000.0 1.693008e+08        0.143459        ?
AY                  118366000.0 1.189635e+08        0.100805        ?
RL                  78617000.0  7.342725e+07        0.062219        ?

Is it possible to do the count together with the sums in a single groupby?

Tags: python, pandas, dataframe, pandas-groupby

Solution


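You can aggregate multiple columns with different functions by passing agg a mapping from column names to functions. Here 'size' counts the rows in each group, so run it on the frame that has already been filtered to Done rows if only Done trades should be counted: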

out = df.groupby('sales_person_name2').agg(
 {'rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum', 'state': 'size'}
)
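If the count column should get its own name instead of staying as state, pandas' named aggregation is one option. The following is a minimal sketch, assuming a frame built from the three rows that head(3) shows in the question; the names df_done, summary, and done_trades are hypothetical choices made for illustration.

```python
import pandas as pd

# The three rows shown by head(3) in the question, rebuilt by hand.
df_done = pd.DataFrame({
    "sales_person_name2": ["AY", "AY", "YJJ"],
    "rfq_qty": [200000.0, 1000000.0, 25000000.0],
    "rfq_qty_CAD_Equiv": [2.568713e+05, 1.284357e+06, 4.420085e+07],
    "state": ["Done", "Done", "Done"],
})

# Named aggregation: output_column=(input_column, function).
summary = df_done.groupby("sales_person_name2").agg(
    rfq_qty=("rfq_qty", "sum"),
    rfq_qty_CAD_Equiv=("rfq_qty_CAD_Equiv", "sum"),
    done_trades=("state", "size"),  # hypothetical name for the count column
)
print(summary)
```

The dict form and the named form compute the same values; the named form only differs in letting the output columns be renamed in the same call.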

Then compute the percentage separately and assign it to a percentage column:

out['percentage'] = out.rfq_qty_CAD_Equiv / out.rfq_qty_CAD_Equiv.sum()
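Putting the steps together, here is a small end-to-end sketch: filter to Done rows, aggregate, add the percentage, and sort. The data is made up for illustration (including the "Cancelled" state, which does not appear in the question), so only the shape of the result carries over to the real frame.

```python
import pandas as pd

# Toy data invented for illustration; "Cancelled" is a hypothetical state
# included only to show that the filter drops non-Done rows.
df = pd.DataFrame({
    "sales_person_name2": ["AY", "AY", "YJJ", "YJJ"],
    "rfq_qty": [100.0, 300.0, 600.0, 50.0],
    "rfq_qty_CAD_Equiv": [100.0, 300.0, 600.0, 50.0],
    "state": ["Done", "Done", "Done", "Cancelled"],
})

# Keep only the Done trades, as in the question.
done = df[df["state"].str.contains("Done")]

# One groupby: two sums plus a row count per salesperson.
out = done.groupby("sales_person_name2").agg(
    {"rfq_qty": "sum", "rfq_qty_CAD_Equiv": "sum", "state": "size"}
)

# Each salesperson's share of the total CAD-equivalent volume.
out["percentage"] = out["rfq_qty_CAD_Equiv"] / out["rfq_qty_CAD_Equiv"].sum()

out = out.sort_values("percentage", ascending=False)
print(out)
```

Note that 'size' counts every row in a group, while 'count' would skip rows where state is missing; after filtering on state the two should agree.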
