首页 > 解决方案 > 在熊猫中计算每个名称和轮次的滞后平均值

问题描述

我需要在我的数据框中计算每个组的滞后平均值。这就是我的 df 的样子:

  name value  round
0    a     5      3
1    b     4      3
2    c     3      2
3    d     1      2
4    a     2      1
5    c     1      1
0    c     1      3
1    d     4      3
2    b     3      2
3    a     1      2
4    b     5      1
5    d     2      1

我想计算value每列的滞后均值nameround。也就是说,对于第 3 轮中的名称 a,我需要 value_mean = 1.5(因为 (1+2)/2)。当然,当 round = 1 时会有 nan 值。

我试过这个:

df['value_mean'] = df.groupby('name').expanding().mean().groupby('name').shift(1)['value'].values

但它给了一个废话:

  name value  round  value_mean
0    a     5      3         NaN
1    b     4      3         5.0
2    c     3      2         3.5
3    d     1      2         NaN
4    a     2      1         4.0
5    c     1      1         3.5
0    c     1      3         NaN
1    d     4      3         3.0
2    b     3      2         2.0
3    a     1      2         NaN
4    b     5      1         1.0
5    d     2      1         2.5

有什么想法,请问我该怎么做?我发现了这个,但它似乎与我的问题无关:Calculate the mean value using two columns in pandas

标签: pandas

解决方案


你可以这样做

# sort the values as they need to be counted
df.sort_values(['name', 'round'], inplace=True)
df.reset_index(drop=True, inplace=True)

# create a grouper to calculate the running count
# and running sum as the basis of the average
grouper= df.groupby('name')
ser_sum=   grouper['value'].cumsum()
ser_count= grouper['value'].cumcount()+1
ser_mean= ser_sum.div(ser_count)
ser_same_name= df['name'] == df['name'].shift(1)
# finally you just have to set the first entry
# in each name-group to NaN (this usually would
# set the entries for each name and round=1 to NaN)
df['value_mean']= ser_mean.shift(1).where(ser_same_name, np.NaN)

# if you want to see the intermediate products, 
# you can uncomment the following lines
#df['sum']= ser_sum
#df['count']= ser_count
df

输出:

   name  value  round  value_mean
0     a      2      1         NaN
1     a      1      2         2.0
2     a      5      3         1.5
3     b      5      1         NaN
4     b      3      2         5.0
5     b      4      3         4.0
6     c      1      1         NaN
7     c      3      2         1.0
8     c      1      3         2.0
9     d      2      1         NaN
10    d      1      2         2.0
11    d      4      3         1.5

推荐阅读