Replacing values based on the current and previous values of another column

Problem description

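Within each ID, when the current value of 'obs1' is 1 and the previous value of 'obs1' is 0, is there a way to set the value in the corresponding 'result' column to zero without using a loop?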

Input data

df <- data.frame(ID = c(1,1,1,1,1,1,1,1,1,1, 2, 2),
     obs1 = c(1,1,1,1,1,1,0,0,1,1,1,1),
     obs2 = c(1,1,1,0,0,0,1,1,1,0,0,1),
     result1 = c(0,28,63,84,105,135,150,150,150,59, 0,300),
     result2 = c(0,28,63,63,63,63,63,31,59,59,0,0))

Expected output:

df <- data.frame(ID = c(1,1,1,1,1,1,1,1,1,1,2,2),
     obs1 = c(1,1,1,1,1,1,0,0,1,1,1,1),
     obs2 = c(1,1,1,0,0,0,1,1,1,0,0,1),
     result1 = c(0,28,63,84,105,135,150,150,0,59,0,300),
     result2 = c(0,28,63,63,63,63,0,31,59,59,0,0))

The changes are in row 7 of the "result2" column and row 9 of the "result1" column.

Tags: r, performance, for-loop, replace

Solution


One option with dplyr:

library(dplyr)
df %>% group_by(ID) %>%
  mutate(result1 = ifelse(obs1==1 & lag(obs1, default = 1) == 0, 0, result1)) %>%
  mutate(result2 = ifelse(obs2==1 & lag(obs2, default = 1) == 0, 0, result2)) %>%
  as.data.frame()
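
The `default = 1` argument in the code above matters for the first row of each ID, which has no previous value. Without it, `lag()` returns `NA` there, so a first row with an observation of 1 would give an `NA` condition and `ifelse()` would return `NA` for it. A minimal sketch of that behaviour, using a made-up vector `obs` (hypothetical, not taken from the question's data):

```r
library(dplyr)

# Hypothetical observation vector for a single ID.
obs <- c(1, 0, 1, 1)

# Without a default, the first element has no predecessor and becomes NA.
prev_na <- lag(obs)

# With default = 1, the first element is compared against 1,
# so it can never count as a 0 -> 1 switch.
prev_1 <- lag(obs, default = 1)

# TRUE only where the value is 1 and the previous value is 0.
switch_on <- obs == 1 & prev_1 == 0
print(switch_on)
```

Here only the third element is flagged, because it is the only 1 that directly follows a 0.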

A more general solution can be written with mutate_at:

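df %>% group_by(ID) %>%
  # For each result* column, look up the obs* column with the same suffix;
  # inside funs(), quo_name(quo(.)) resolves to the name of the column being processed.
  # Note: funs() is deprecated as of dplyr 0.8.0.
  mutate_at(vars(starts_with("result")),
            funs(ifelse(get(sub("result", "obs", quo_name(quo(.)))) == 1 &
                          lag(get(sub("result", "obs", quo_name(quo(.)))),
                              default = 1) == 0, 0, .))) %>%
  as.data.frame()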

#    ID obs1 obs2 result1 result2
# 1   1    1    1       0       0
# 2   1    1    1      28      28
# 3   1    1    1      63      63
# 4   1    1    0      84      63
# 5   1    1    0     105      63
# 6   1    1    0     135      63
# 7   1    0    1     150       0
# 8   1    0    1     150      31
# 9   1    1    1       0      59
# 10  1    1    0      59      59
# 11  2    1    0       0       0
# 12  2    1    1     300       0
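
Since `mutate_at()` is superseded and `funs()` is deprecated in recent dplyr releases, the same idea can probably be expressed with `across()` and `cur_column()`. The following is a sketch rather than part of the original answer; it assumes dplyr 1.0.0 or later and that each `result*` column has an `obs*` column with the same suffix. The name `out` is introduced here only for illustration.

```r
library(dplyr)  # across() and cur_column() assume dplyr >= 1.0.0

df <- data.frame(
  ID      = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2),
  obs1    = c(1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1),
  obs2    = c(1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1),
  result1 = c(0, 28, 63, 84, 105, 135, 150, 150, 150, 59, 0, 300),
  result2 = c(0, 28, 63, 63, 63, 63, 63, 31, 59, 59, 0, 0)
)

out <- df %>%
  group_by(ID) %>%
  mutate(across(
    starts_with("result"),
    # cur_column() is the name of the result column being processed,
    # e.g. "result1", which sub() maps to "obs1"; get() then fetches
    # that obs column for the current ID group.
    ~ ifelse(
      get(sub("result", "obs", cur_column())) == 1 &
        lag(get(sub("result", "obs", cur_column())), default = 1) == 0,
      0,   # obs switched from 0 to 1 within the ID: reset to zero
      .x   # otherwise keep the original value
    )
  )) %>%
  ungroup() %>%
  as.data.frame()

print(out)
```

The condition is the same as in the `mutate_at` version, so `out` should match the expected output from the question; only the column-lookup mechanism differs.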
