Creating a flag column based on amounts in pandas

Problem description

I have the following data frame of experiments, showing whether each one won and the number of plans offered.

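Each experiment_id can have many plans offered. I want to flag the largest plans_offered value as "More Plans" and any lower amount as "Less Plans". Assume the experiment ids are repeated (not unique). How can I create this flag?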

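Edit: I realized the above does not fully describe my problem. When the plans_offered number is neither the highest nor the lowest in its experiment, I want the new flag to show "More/Less Plans".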

Input

experiment_id   winner  plans_offered
1               1       3
1               0       1
2               1       3
2               0       7
3               1       6
3               0       5
4               1       2
4               0       3
4               0       4
5               1       5
5               0       4

Expected output

experiment_id   winner  plans_offered  flag
1               1       3              More Plans
1               0       1              Less Plans
2               1       3              Less Plans
2               0       7              More Plans
3               1       6              More Plans
3               0       5              Less Plans
4               1       2              Less Plans
4               0       3              More/Less Plans
4               0       4              More Plans
5               1       5              More Plans
5               0       4              Less Plans

Tags: python, pandas

Solution


Use transform to compare each row with its group maximum, then map the result to labels

# Flag each row by whether it equals its group's maximum, then map the booleans to labels
df['new'] = df.groupby('experiment_id')['plans_offered'].transform(lambda x: x == x.max()).map({True: 'More Plans', False: 'Less Plans'})
df
    experiment_id  winner  plans_offered         new
0               1       1              3  More Plans
1               1       0              1  Less Plans
2               2       1              3  Less Plans
3               2       0              7  More Plans
4               3       1              6  More Plans
5               3       0              5  Less Plans
6               4       1              2  Less Plans
7               4       0              3  Less Plans
8               4       0              4  More Plans
9               5       1              5  More Plans
10              5       0              4  Less Plans
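
To try this end to end, here is a minimal self-contained sketch. It rebuilds the input table from the question by hand and, as a variation that is not part of the answer above, compares against transform('max') directly instead of using a lambda. The column name flag is borrowed from the question's expected output rather than the answer's new, and is_max is just an illustrative name.

```python
import pandas as pd

# Input table from the question, typed in by hand
df = pd.DataFrame({
    "experiment_id": [1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5],
    "winner":        [1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0],
    "plans_offered": [3, 1, 3, 7, 6, 5, 2, 3, 4, 5, 4],
})

# True where the row holds the largest plans_offered within its experiment_id
is_max = df["plans_offered"] == df.groupby("experiment_id")["plans_offered"].transform("max")

# Two-way flag: the group maximum is "More Plans", everything else "Less Plans"
df["flag"] = is_max.map({True: "More Plans", False: "Less Plans"})
print(df)
```

With only two labels, the middle row of experiment 4 (plans_offered = 3) is flagged "Less Plans", which is why the edited question needs a third category.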

Update

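import numpy as np

g = df.groupby('experiment_id')['plans_offered']
cond1 = df['plans_offered'] == g.transform('max')  # row holds the group maximum
cond2 = df['plans_offered'] == g.transform('min')  # row holds the group minimum
# Rows matching neither condition get the default label
df['new'] = np.select([cond1, cond2], ['More Plans', 'Less Plans'], default='More/Less Plans')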
df
    experiment_id  winner  plans_offered              new
0               1       1              3       More Plans
1               1       0              1       Less Plans
2               2       1              3       Less Plans
3               2       0              7       More Plans
4               3       1              6       More Plans
5               3       0              5       Less Plans
6               4       1              2       Less Plans
7               4       0              3  More/Less Plans
8               4       0              4       More Plans
9               5       1              5       More Plans
10              5       0              4       Less Plans
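
One detail worth knowing about the np.select approach is that it picks the first condition that is true, so a row that is both the group maximum and the group minimum gets "More Plans". The sketch below illustrates this on a small frame. Experiment 6 is a hypothetical single-row group that does not appear in the question, and the names is_max, is_min and flag are illustrative.

```python
import numpy as np
import pandas as pd

# Experiment 4 is taken from the question; experiment 6 is a made-up
# single-row group, so its only value is both the group max and the group min.
df = pd.DataFrame({
    "experiment_id": [4, 4, 4, 6],
    "winner":        [1, 0, 0, 1],
    "plans_offered": [2, 3, 4, 5],
})

g = df.groupby("experiment_id")["plans_offered"]
is_max = df["plans_offered"] == g.transform("max")
is_min = df["plans_offered"] == g.transform("min")

# np.select takes the first matching condition, so is_max wins over is_min
df["flag"] = np.select(
    [is_max, is_min],
    ["More Plans", "Less Plans"],
    default="More/Less Plans",
)
print(df)
```

If single-row groups, or groups where every value is equal, should get a different label, one option would be to put a separate condition for that case first in the list.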
