Python groupby, cumsum and max to calculate month-end balances

Problem description

I have data that I have massaged to look like this.

         Date    Amount Month   Balance
0    9/4/2018  32000.00     9  32000.00
1   9/30/2018     29.59     9  32029.59
2   10/1/2018     34.05    10  32063.64
3  10/31/2018  -1000.00    10  31063.64
4   11/1/2018   1500.00    11  32563.64
5  11/30/2018     33.06    11  32596.70
6   12/1/2018  -2000.00    12  30596.70
7  12/31/2018     34.26    12  30630.96

I need to calculate the balance at the end of each month and the highest month-end balance. I have tried various combinations of groupby, cumsum and max, but I am not getting the expected results.

Here is what I have so far:

month_end_balance = yearly_df.groupby('Month')['Amount'].cumsum()
max_month_end_balance = month_end_balance.max()

I am expecting the month_end_balance to be:

9   32029.59
10  31063.64
11  32596.70
12  30630.96

I am expecting the max_month_end_balance to be 32596.70

Tags: python, pandas

Solution


First, convert the Date column to datetime:

import pandas as pd
df.Date = pd.to_datetime(df.Date, format='%m/%d/%Y')

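Then take the running total over all rows and keep the last value in each month. Note that max() would return the highest balance reached during the month (32063.64 for October), not the month-end balance, so last() is used here:

# Running balance over all rows; last() keeps the final value in each month.
# This assumes the rows are already sorted by Date.
m = (df.assign(cum_Amount=df.Amount.cumsum())
       .groupby(df.Date.dt.month)['cum_Amount'].last().reset_index())
print(m)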

   Date  cum_Amount
0     9    32029.59
1    10    31063.64
2    11    32596.70
3    12    30630.96
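
For anyone who wants to run this end to end, here is a minimal self-contained sketch that also computes the highest month-end balance asked for in the question. The sample frame is rebuilt from the question's data; the names balance, month_end_balance and best_month are just illustrative, and rounding to 2 decimals is an assumption added to keep floating-point noise out of the result. Grouping by month number alone only works while the data covers a single year.

```python
import pandas as pd

# Sample data copied from the question (Date and Amount columns only).
df = pd.DataFrame({
    "Date": ["9/4/2018", "9/30/2018", "10/1/2018", "10/31/2018",
             "11/1/2018", "11/30/2018", "12/1/2018", "12/31/2018"],
    "Amount": [32000.00, 29.59, 34.05, -1000.00,
               1500.00, 33.06, -2000.00, 34.26],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Sort chronologically so the last row of each month is the month-end row.
df = df.sort_values("Date")

# Running balance over the whole period (rounded: an assumption for currency).
balance = df["Amount"].cumsum().round(2)

# Last running-balance value within each calendar month.
month_end_balance = balance.groupby(df["Date"].dt.month).last()

# Highest month-end balance and the month it occurs in.
max_month_end_balance = month_end_balance.max()
best_month = month_end_balance.idxmax()

print(month_end_balance)
print(max_month_end_balance, best_month)
```

If the data spans more than one year, grouping by df["Date"].dt.to_period("M") instead of the month number should keep the same months of different years apart.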

Edit: it seems you already have the balance and only want to keep the rows that fall on a month-end date. In that case, use:

from pandas.tseries.offsets import MonthEnd
df[df.Date.eq(df.Date+MonthEnd(0))]

        Date   Amount  Month   Balance
1 2018-09-30    29.59      9  32029.59
3 2018-10-31 -1000.00     10  31063.64
5 2018-11-30    33.06     11  32596.70
7 2018-12-31    34.26     12  30630.96
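
From the filtered rows, the highest month-end balance is just the maximum of the Balance column. The sketch below rebuilds the question's frame and uses the Series.dt.is_month_end accessor as an alternative to the MonthEnd comparison; the variable names are illustrative. Both variants assume every month has a row dated on its last calendar day, which holds for this data but may not hold in general.

```python
import pandas as pd

# Sample data copied from the question, including the existing Balance column.
df = pd.DataFrame({
    "Date": ["9/4/2018", "9/30/2018", "10/1/2018", "10/31/2018",
             "11/1/2018", "11/30/2018", "12/1/2018", "12/31/2018"],
    "Amount": [32000.00, 29.59, 34.05, -1000.00,
               1500.00, 33.06, -2000.00, 34.26],
    "Balance": [32000.00, 32029.59, 32063.64, 31063.64,
                32563.64, 32596.70, 30596.70, 30630.96],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Keep only rows dated on the last calendar day of their month.
month_end_rows = df[df["Date"].dt.is_month_end]

# Highest month-end balance, and the full row it comes from.
max_month_end_balance = month_end_rows["Balance"].max()
max_row = month_end_rows.loc[month_end_rows["Balance"].idxmax()]

print(month_end_rows)
print(max_month_end_balance)
```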
