pandas - 在熊猫中计算每个名称和轮次的滞后平均值

问题描述

我需要在我的数据框中计算每个组的滞后平均值。这就是我的 df 的样子：

  name value  round
0    a     5      3
1    b     4      3
2    c     3      2
3    d     1      2
4    a     2      1
5    c     1      1
0    c     1      3
1    d     4      3
2    b     3      2
3    a     1      2
4    b     5      1
5    d     2      1

我想计算value每列的滞后均值name和round。也就是说，对于第 3 轮中的名称 a，我需要 value_mean = 1.5（因为 (1+2)/2）。当然，当 round = 1 时会有 nan 值。

我试过这个：

df['value_mean'] = df.groupby('name').expanding().mean().groupby('name').shift(1)['value'].values

但它给了一个废话：

  name value  round  value_mean
0    a     5      3         NaN
1    b     4      3         5.0
2    c     3      2         3.5
3    d     1      2         NaN
4    a     2      1         4.0
5    c     1      1         3.5
0    c     1      3         NaN
1    d     4      3         3.0
2    b     3      2         2.0
3    a     1      2         NaN
4    b     5      1         1.0
5    d     2      1         2.5

有什么想法，请问我该怎么做？我发现了这个，但它似乎与我的问题无关：Calculate the mean value using two columns in pandas

标签： pandas

你可以这样做

# sort the values as they need to be counted
df.sort_values(['name', 'round'], inplace=True)
df.reset_index(drop=True, inplace=True)

# create a grouper to calculate the running count
# and running sum as the basis of the average
grouper= df.groupby('name')
ser_sum=   grouper['value'].cumsum()
ser_count= grouper['value'].cumcount()+1
ser_mean= ser_sum.div(ser_count)
ser_same_name= df['name'] == df['name'].shift(1)
# finally you just have to set the first entry
# in each name-group to NaN (this usually would
# set the entries for each name and round=1 to NaN)
df['value_mean']= ser_mean.shift(1).where(ser_same_name, np.NaN)

# if you want to see the intermediate products, 
# you can uncomment the following lines
#df['sum']= ser_sum
#df['count']= ser_count
df

输出：

   name  value  round  value_mean
0     a      2      1         NaN
1     a      1      2         2.0
2     a      5      3         1.5
3     b      5      1         NaN
4     b      3      2         5.0
5     b      4      3         4.0
6     c      1      1         NaN
7     c      3      2         1.0
8     c      1      3         2.0
9     d      2      1         NaN
10    d      1      2         2.0
11    d      4      3         1.5

pandas - 在熊猫中计算每个名称和轮次的滞后平均值

问题描述

解决方案

推荐阅读