首页 > 解决方案 > 如何在熊猫中执行条件分组计算并前向填充零值

问题描述

我有以下数据框:

import pandas as pd
#Create DF
d = { 
     'Date': ['1/01/2021','2/01/2021','3/01/2021','4/01/2021','5/01/2021','6/01/2021','7/01/2021','8/01/2021','9/01/2021','10/01/2021','11/01/2021','12/01/2021','13/01/2021',
'14/01/2021','15/01/2021','16/01/2021'],
    'Name': ['Joe','Joe','Joe','Joe','Joe','Joe','Joe','Joe','Joe','John','John','John','John','John','John','John'],
    'Criteria':[30,35,2.5,3,6,15,20,1.5,4,25,50,75,2,4,9,2],
    'Sum' : [-1,-1,1.5,-1,5,-1,-1,-1,3,-1,-1,-1,-1,3,-1,-1],
    'Sum Divided by Count where Criteria <5' : [0.000,0.000,1.500,0.250,0.000,0.000,0.000,-0.167,0.625,0.000,0.000,0.000,-1.000,1.000,0.000,0.333]
    
    }

              
df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df.Date,format='%d/%m/%Y')
df

在此处输入图像描述

最后一列计算小于的列的总和除以小于Sum的行数。并按 分组。红色突出显示符合条件的行。Criteria5Criteria5Name

我还想做的是向所有行添加前向填充,但将初始行保持为零。

以上所有在熊猫中都可能吗?

我的最终结果应该是:

在此处输入图像描述

任何帮助将不胜感激!

标签: pythonpandas

解决方案


将下面的值替换为 in中5的缺失值,然后每组要求将总和除以计数,如果值不匹配,则将它们替换为in :SumDataFrame.assignExpanding.mean0Series.where

m = df['Criteria'] < 5
df['new'] = (df.assign(Sum = df['Sum'].where(m))
               .groupby('Name')['Sum']
               .expanding()
               .mean()
               .reset_index(level=0, drop=True)
               .where(m)
               .groupby(df['Name'])
               .ffill()
               .fillna(0))

print (df)
         Date  Name  Criteria  Sum  Sum Divided by Count where Criteria <5  \
0  2021-01-01   Joe      30.0 -1.0                                   0.000   
1  2021-01-02   Joe      35.0 -1.0                                   0.000   
2  2021-01-03   Joe       2.5  1.5                                   1.500   
3  2021-01-04   Joe       3.0 -1.0                                   0.250   
4  2021-01-05   Joe       6.0  5.0                                   0.000   
5  2021-01-06   Joe      15.0 -1.0                                   0.000   
6  2021-01-07   Joe      20.0 -1.0                                   0.000   
7  2021-01-08   Joe       1.5 -1.0                                  -0.167   
8  2021-01-09   Joe       4.0  3.0                                   0.625   
9  2021-01-10  John      25.0 -1.0                                   0.000   
10 2021-01-11  John      50.0 -1.0                                   0.000   
11 2021-01-12  John      75.0 -1.0                                   0.000   
12 2021-01-13  John       2.0 -1.0                                  -1.000   
13 2021-01-14  John       4.0  3.0                                   1.000   
14 2021-01-15  John       9.0 -1.0                                   0.000   
15 2021-01-16  John       2.0 -1.0                                   0.333   

         new  
0   0.000000  
1   0.000000  
2   1.500000  
3   0.250000  
4   0.250000  
5   0.250000  
6   0.250000  
7  -0.166667  
8   0.625000  
9   0.000000  
10  0.000000  
11  0.000000  
12 -1.000000  
13  1.000000  
14  1.000000  
15  0.333333  

推荐阅读