Pandas Groupby Lambda function with multiple conditions/columns

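Question

I'm trying to create a new column that groups the df by Deal and Month and applies a percentage (9%) to the Amount column. If all the Amounts of a Deal in a particular month add up to at least 20,000, apply the percentage to Amount; otherwise, if TYPE is MONTHLY and the individual Amount is at least 1500, apply the percentage to Amount; otherwise, multiply by 0.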

df.groupby(['Deal', 'Month'])["Amount"].apply(
    lambda x: x.sum() * 0.09 if x.sum() >= 20000 else (
        x * 0.09 if x >= 1500 and x['TYPE'] == 'MONTHLY' else 0
    )
)

This is what I've tried, but I keep getting errors such as ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). or KeyError: ('TYPE', u'occurred at index 0'). I've also tried using transform instead of apply. Any help is much appreciated.

My grouped DF, with the Desired Column added, looks like this:

   Deal         TYPE    Month        Amount   Desired Column
0   Com A   ANNUAL  April   10021.34   0
1   Com A   MONTHLY April   35.86      0
2   Com B   MONTHLY April   11150.05   1,003.50
3   Com B   ANNUAL  July    661.65     0
4   Com B   ANNUAL  August  303.63     0
5   Com C   ANNUAL  April   25624.59   2,306.21
6   Com D   ANNUAL  June    27309.26   2,457.83  
7   Com D   ANNUAL  July    0.00       0
8   Com D   ANNUAL  August  0.00       0
9   Com E   ANNUAL  April   10.65      0
10  Com E   MONTHLY May     0.00       0
11  Com E   ANNUAL  May     18716.70   1,684.5
12  Com E   MONTHLY June    0.00       0
13  Com E   ANNUAL  June    606.49     0
14  Com E   MONTHLY July    0.00       0
15  Com E   MONTHLY July    8890.17    800.11
16  Com E   MONTHLY August  4000       0
17  Com E   ANNUAL  August  16000      1,800
18  Com E   ANNUAL  September 2157.34  0
19  Com E   ANNUAL  October 3025.24    0


Tags: python, pandas, lambda, group-by

Answer


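You don't need a groupby in this case. There are a few ways to do this; conceptually the simplest is to first compute the threshold for each row, depending on whether it is a monthly or an annual amount: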

df['Threshold'] = (df.TYPE=='ANNUAL')*20000 + (df.TYPE=='MONTHLY')*1500

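Then you can compute the amount based on whether the threshold is met:

# >= rather than >, so an Amount exactly at the threshold qualifies ("at least" in the question)
df['Desired Amount'] = (df.Amount>=df.Threshold)*0.09*df.Amount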
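
As a minimal, self-contained sketch of the two steps above, the snippet below runs them on a handful of rows copied from the question's table. The choice of rows, the `>=` comparison (to match "at least" in the question), and the final `round(2)` are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Illustrative subset of the rows shown in the question.
df = pd.DataFrame({
    "Deal":   ["Com A", "Com A", "Com B", "Com C", "Com E"],
    "TYPE":   ["ANNUAL", "MONTHLY", "MONTHLY", "ANNUAL", "ANNUAL"],
    "Month":  ["April", "April", "April", "April", "October"],
    "Amount": [10021.34, 35.86, 11150.05, 25624.59, 3025.24],
})

# A boolean Series behaves like 0/1 under multiplication, so each row
# picks up exactly one of the two thresholds depending on its TYPE.
df["Threshold"] = (df.TYPE == "ANNUAL") * 20000 + (df.TYPE == "MONTHLY") * 1500

# The mask zeroes out rows below their threshold; the rest get 9% of Amount.
df["Desired Amount"] = ((df.Amount >= df.Threshold) * 0.09 * df.Amount).round(2)

print(df)
```

Multiplying by a boolean mask keeps the whole calculation vectorized, which avoids the "truth value of a Series is ambiguous" error raised when a Series is used inside a Python `if`.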

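This works as long as each combination of deal, month, and type appears on a single row. If a combination can span several rows, you first need a groupby to aggregate them:

# Sum Amount within each (Deal, Month, TYPE) combination, keeping the keys as columns
df = df.groupby(['Deal','Month','TYPE'], as_index=False)['Amount'].sum()

Then you can proceed as above.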
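
The aggregate-first variant can be sketched as follows. The data here is hypothetical (it is not taken from the question) and was chosen so that one deal/month/type combination is split across two rows; the `>=` comparison and the `round(2)` are likewise assumptions.

```python
import pandas as pd

# Hypothetical data: each (Deal, Month, TYPE) combination is split over two rows.
df = pd.DataFrame({
    "Deal":   ["Com E", "Com E", "Com D", "Com D"],
    "TYPE":   ["MONTHLY", "MONTHLY", "ANNUAL", "ANNUAL"],
    "Month":  ["July", "July", "June", "June"],
    "Amount": [1000.0, 800.0, 15000.0, 12309.26],
})

# Collapse duplicates so each (Deal, Month, TYPE) has one summed Amount.
agg = df.groupby(["Deal", "Month", "TYPE"], as_index=False)["Amount"].sum()

# Same threshold logic as before, now applied to the aggregated rows.
agg["Threshold"] = (agg.TYPE == "ANNUAL") * 20000 + (agg.TYPE == "MONTHLY") * 1500
agg["Desired Amount"] = ((agg.Amount >= agg.Threshold) * 0.09 * agg.Amount).round(2)

print(agg)
```

Neither monthly row reaches 1500 on its own, but their sum does, so aggregating first changes the result. Whether that is the intended behaviour depends on how the rule is meant to treat split rows.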

