How to get the average of the n values before and the n values after each value in a DataFrame column

Problem description

The easiest way to explain this is probably with an example.

Imagine the following DataFrame:

a  b 
1  5   
2  4
3  2
4  2
5  4
6  3
7  2
8  1
9  0

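For each value in column b, I would like to get the average of the 3 values before it and the 3 values after it (the value itself is not included). Rows that do not have 3 values on both sides stay empty. So the result should look like this: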

a  b  c
1  5
2  4
3  2
4  2  3.3
5  4  2.3
6  3  1.83
7  2
8  1
9  0

Any help is appreciated.

Thanks

Tags: python, pandas, dataframe, average

Solution

Here is my solution using numpy:
(df is your example DataFrame)

import numpy as np

length = df.shape[0]   # Number of rows in the dataframe
windowSize = 3         # We average the 3 values before and the 3 values after
b = df.b.to_numpy()    # Column b as a NumPy array (extracted once, outside the loop)
df['c'] = np.nan       # Rows without a full window on both sides stay NaN

for i in range(windowSize, length - windowSize):
    # Positions (0-based) of the 3 values before and the 3 values after row i
    top3Idxs = np.arange(i - windowSize, i)
    bottom3Idxs = np.arange(i + 1, i + 1 + windowSize)

    # Get the values in column b at those positions
    top3Vals = b[top3Idxs]
    bottom3Vals = b[bottom3Idxs]

    # Find the average of the top3Vals and bottom3Vals
    avg = np.mean(np.concatenate((top3Vals, bottom3Vals)))

    # Set the average in column c (df.index[i] maps the position to its row label)
    df.at[df.index[i], 'c'] = avg
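
As an alternative to an explicit loop, the same result can probably be obtained with a centered rolling window in pandas. The following is only a minimal sketch, not the method from the answer above: the helper name neighbor_mean is made up for illustration, and it assumes column b has no missing values. It sums a window of 2n + 1 values centered on each row, subtracts the row's own value, and divides by 2n.

```python
import pandas as pd


def neighbor_mean(series: pd.Series, n: int) -> pd.Series:
    """Mean of the n values before and the n values after each element.

    Hypothetical helper for illustration; assumes no missing values.
    """
    # Centered window of 2n + 1 values. min_periods defaults to the window
    # size, so rows without n neighbors on both sides come out as NaN.
    window_sum = series.rolling(window=2 * n + 1, center=True).sum()
    # Remove the element itself, then average the remaining 2n neighbors.
    return (window_sum - series) / (2 * n)


# The example frame from the question
df = pd.DataFrame({"a": range(1, 10), "b": [5, 4, 2, 2, 4, 3, 2, 1, 0]})
df["c"] = neighbor_mean(df["b"], n=3)
print(df)
```

Only the rows with a = 4, 5 and 6 get a value in c; all other rows are NaN because they lack 3 neighbors on one side.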
