Difference between values in a column by group

Problem description

Hi, I need to achieve something like this:

grp value   diff
1   10       NA  # diff[1] = value[2]-value[0] of grp = 1
1   15       10  # diff[2] = value[3]-value[1] of grp = 1
1   20       -5  # diff[3] = value[4]-value[2] of grp = 1
1   10       NA  # diff[4] = value[5]-value[3] of grp = 1
2   25       NA  # diff[5] = value[6]-value[4] of grp = 2
2   30       10  # diff[6] = value[7]-value[5] of grp = 2
2   35       NA  # diff[7] = value[8]-value[6] of grp = 2

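I tried functions such as shift and lag but could not get this kind of result. What I want is the next value minus the previous value within each group, that is, diff[i] = value[i+1] - value[i-1].

I ran into errors using a for loop, so is there a better way to do this?

Tags: r, dplyr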

Solution


I think you have a typo in your description of the diff values. However, if you want diff[i] to be value[i+1] - value[i-1], you can use lead and lag together in dplyr:

library(dplyr)
# Within each group, subtract the previous value from the next value
df %>% group_by(grp) %>% mutate(diff = lead(value) - lag(value))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20    -5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA
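
For readers who want to see what lead and lag are doing here, the following is a minimal base R sketch of the same calculation, assuming the df from the Data section. The helper name lead_minus_lag is hypothetical and not part of dplyr; it only mimics lead(value) - lag(value) for a single group, and ave() applies it group by group.

```r
# Same data as in the Data section
df <- data.frame(grp = c(rep(1, 4), rep(2, 3)),
                 value = c(10, 15, 20, 10, 25, 30, 35))

# Hypothetical helper: next value minus previous value within one group
lead_minus_lag <- function(x) {
  idx <- seq_along(x)
  nxt <- x[idx + 1]        # indexing past the end yields NA, like lead()
  prv <- c(NA, x)[idx]     # shift right by one, NA first, like lag()
  nxt - prv
}

# ave() applies the helper separately to the values of each grp
df$diff <- ave(df$value, df$grp, FUN = lead_minus_lag)
print(df)
```

The first and last row of each group come out as NA because one of the two neighbors does not exist there, which matches the tibble output above.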

Edit: absolute difference

If you need the absolute difference, you can do the following:

df %>% group_by(grp) %>% mutate(diff = abs(lead(value) - lag(value)))

# A tibble: 7 x 3
# Groups:   grp [2]
    grp value  diff
  <dbl> <dbl> <dbl>
1     1    10    NA
2     1    15    10
3     1    20     5
4     1    10    NA
5     2    25    NA
6     2    30    10
7     2    35    NA
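
Since the question mentions shift, here is a sketch of how the same grouped calculation might look with data.table, assuming that package is installed and using the df from the Data section. The names dt and abs_diff are only illustrative.

```r
library(data.table)

# Same data as in the Data section
df <- data.frame(grp = c(rep(1, 4), rep(2, 3)),
                 value = c(10, 15, 20, 10, 25, 30, 35))
dt <- as.data.table(df)

# shift(type = "lead") is the next value, shift(type = "lag") the previous one;
# by = grp keeps the shifting inside each group
dt[, diff := shift(value, type = "lead") - shift(value, type = "lag"), by = grp]

# Absolute version of the same difference
dt[, abs_diff := abs(diff)]
print(dt)
```

Note that := modifies dt in place, so no reassignment is needed.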

Does this look like what you are looking for?

Data

df = data.frame(grp = c(rep(1,4),rep(2,3)),
                value = c(10,15,20,10,25,30,35))
