How to calculate a rolling mean only when a marker column is 1

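Question

I want to calculate a rolling mean only on rows where the Marker column is 1. This is a small example, but the real-world data is massive, so the solution needs to be efficient.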

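import pandas as pd

df = pd.DataFrame()
df['Obs'] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
df['Marker'] = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
# 5-row rolling mean over every row; the first four rows are NaN
df['Mean'] = df.Obs.rolling(5).mean()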

How can I create a Desired column like this:

df['Desired']=[0,0,0,0,3.0,0,0,0,0,8.0,0,0,0,0,13.0]

print(df)

    Obs  Marker  Mean  Desired
0     1       0   NaN      0.0
1     2       0   NaN      0.0
2     3       0   NaN      0.0
3     4       0   NaN      0.0
4     5       1   3.0      3.0
5     6       0   4.0      0.0
6     7       0   5.0      0.0
7     8       0   6.0      0.0
8     9       0   7.0      0.0
9    10       1   8.0      8.0
10   11       0   9.0      0.0
11   12       0  10.0      0.0
12   13       0  11.0      0.0
13   14       0  12.0      0.0
14   15       1  13.0     13.0

Tags: python-3.x, pandas, numpy

Answer


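You're close. You only need to add a where call, which keeps the rolling mean on rows where Marker is 1 and replaces the value on every other row with 0: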

df['Mean']= df.Obs.rolling(5).mean().where(df['Marker']==1, 0)

Output:

    Obs  Marker  Mean
0     1       0   0.0
1     2       0   0.0
2     3       0   0.0
3     4       0   0.0
4     5       1   3.0
5     6       0   0.0
6     7       0   0.0
7     8       0   0.0
8     9       0   0.0
9    10       1   8.0
10   11       0   0.0
11   12       0   0.0
12   13       0   0.0
13   14       0   0.0
14   15       1  13.0
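
Since the question stresses efficiency on very large data, one possible variation is to skip the rolling computation for unmarked rows and evaluate the window mean only at the marked rows, using a cumulative sum. The sketch below is not part of the original answer. The function name marked_rolling_mean is hypothetical, and it assumes Obs contains no NaN values. Unlike the where version, it writes 0 rather than NaN for a marked row that has fewer than window observations up to and including itself.

```python
import numpy as np
import pandas as pd


def marked_rolling_mean(obs, marker, window):
    """Window mean at rows where marker == 1, and 0 everywhere else.

    Hypothetical helper for illustration; assumes obs has no NaN values.
    """
    obs = np.asarray(obs, dtype=float)
    marker = np.asarray(marker)

    # csum[k] holds the sum of the first k observations, so the sum of
    # rows i-window+1 .. i is csum[i + 1] - csum[i + 1 - window].
    csum = np.concatenate(([0.0], np.cumsum(obs)))

    out = np.zeros(len(obs))
    idx = np.flatnonzero(marker == 1)
    # Drop marked rows that do not yet have a full window behind them.
    idx = idx[idx >= window - 1]
    out[idx] = (csum[idx + 1] - csum[idx + 1 - window]) / window
    return out


df = pd.DataFrame({
    "Obs": list(range(1, 16)),
    "Marker": [0, 0, 0, 0, 1] * 3,
})

# The where-based approach from the answer, for comparison.
df["Desired"] = df["Obs"].rolling(5).mean().where(df["Marker"] == 1, 0)

# The cumulative-sum sketch.
df["DesiredCumsum"] = marked_rolling_mean(df["Obs"], df["Marker"], 5)

print(df)
```

Whether this is actually faster depends on the data: rolling(...).mean() already runs in compiled code, so it is worth timing both versions on a realistic sample. A cumulative sum over a very long float series can also accumulate rounding error, so the two columns may differ in the last few digits on real data.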
