Is there a way in R to create an ifelse over defined consecutive rows?

Problem description

If I have:

df<-data.frame(group=c(1, 1,1, 1,1, 2, 2, 2, 4,4,4,4), 
              value=c("A","B","C","B","A","A","A","B","D","A","A","B"))

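I want an ifelse statement, or something equivalent, that checks whether any run of 3 consecutive rows, starting from the first row within a group, contains certain values. For example, starting with group 1, I want to scan ABC, then BCB, then CBA, and create a "want" column recording whether "C" shows up in any of those scans. Something like this: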


  group value want_any_c want_any_b
1      1     A        yes        yes
2      1     B        yes        yes
3      1     C        yes        yes
4      1     B        yes        yes
5      1     A        yes        yes
6      2     A         no        yes
7      2     A         no        yes
8      2     B         no        yes
9      4     D         no        yes
10     4     A         no        yes
11     4     A         no        yes
12     4     B         no        yes

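Follow-up: I would also like to check whether every scan of 3 contains a value, starting from the first row in a group and then moving on to the next group (i.e. group 1 scans ABC, BCB, CBA; group 2 scans AAB; group 4 scans DAA, AAB) (ty akrun):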

  group value want_any_c want_any_b want_every_c want_every_b
1      1     A        yes        yes          yes          yes
2      1     B        yes        yes          yes          yes
3      1     C        yes        yes          yes          yes
4      1     B        yes        yes          yes          yes
5      1     A        yes        yes          yes          yes
6      2     A         no        yes           no          yes
7      2     A         no        yes           no          yes
8      2     B         no        yes           no          yes
9      4     D         no        yes           no           no
10     4     A         no        yes           no           no
11     4     A         no        yes           no           no
12     4     B         no        yes           no           no

Tags: r

Solution


We can use %in% within each group, which here works much like any(value == 'C'):

library(dplyr)
df %>% 
   group_by(group) %>%
   mutate(want_any_c = c('no', 'yes')[('C' %in% value) + 1],
           want_any_b = c('no', 'yes')[('B' %in% value) + 1])
# A tibble: 12 x 4
# Groups:   group [3]
#   group value want_any_c want_any_b
#   <dbl> <fct> <chr>      <chr>     
# 1     1 A     yes        yes       
# 2     1 B     yes        yes       
# 3     1 C     yes        yes       
# 4     1 B     yes        yes       
# 5     1 A     yes        yes       
# 6     2 A     no         yes       
# 7     2 A     no         yes       
# 8     2 B     no         yes       
# 9     4 D     no         yes       
#10     4 A     no         yes       
#11     4 A     no         yes       
#12     4 B     no         yes       
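The `c('no', 'yes')[cond + 1]` expression above is an indexing idiom standing in for ifelse: a logical coerces to 0 or 1, so adding 1 picks the first or second label. The sketch below, in base R only, shows that equivalence and one possible grouped version using `ave()`. Passing `stringsAsFactors = FALSE` and using `ave()` are assumptions of this sketch, not part of the answer.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps value as character
df <- data.frame(group = c(1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 4, 4),
                 value = c("A", "B", "C", "B", "A", "A", "A", "B", "D", "A", "A", "B"),
                 stringsAsFactors = FALSE)

# TRUE/FALSE coerce to 1/0, so adding 1 selects "yes" (index 2) or "no" (index 1)
has_c <- "C" %in% c("A", "B", "C", "B", "A")
label_by_index <- c("no", "yes")[has_c + 1]
label_by_if    <- if (has_c) "yes" else "no"

# Grouped version without dplyr: ave() recycles the single label over each group
df$want_any_c <- ave(df$value, df$group,
                     FUN = function(v) if ("C" %in% v) "yes" else "no")
```

The indexing form is compact, while the `if`/`else` form may be easier to read for anyone unfamiliar with logical-to-integer coercion.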

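If every scan of 3 values has to contain the value, apply a rolling window of width 3 within each group using rollapply from zoo: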

library(zoo)
df %>%
 group_by(group) %>%
  mutate(want_any_c = c('no', 'yes')[('C' %in% value) + 1],
        want_any_b = c('no', 'yes')[('B' %in% value) + 1],
        want_every_c = c('no', 'yes')[(all(rollapply(value, 3,
             FUN = function(x) 'C' %in% x))) + 1],
        want_every_b = c('no', 'yes')[(all(rollapply(value, 3, 
             FUN = function(x) 'B' %in% x))) + 1])
# A tibble: 12 x 6
# Groups:   group [3]
#   group value want_any_c want_any_b want_every_c want_every_b
#   <dbl> <fct> <chr>      <chr>      <chr>        <chr>       
# 1     1 A     yes        yes        yes          yes         
# 2     1 B     yes        yes        yes          yes         
# 3     1 C     yes        yes        yes          yes         
# 4     1 B     yes        yes        yes          yes         
# 5     1 A     yes        yes        yes          yes         
# 6     2 A     no         yes        no           yes         
# 7     2 A     no         yes        no           yes         
# 8     2 B     no         yes        no           yes         
# 9     4 D     no         yes        no           no          
#10     4 A     no         yes        no           no          
#11     4 A     no         yes        no           no          
#12     4 B     no         yes        no           no          
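To make the "every scan of 3" logic explicit, here is a minimal base R sketch of the same check without zoo. The helper name `every_window_has` and its `width` argument are hypothetical, introduced only for illustration. Because `all()` of an empty logical vector is `TRUE`, a group with fewer rows than the window could be labelled "yes" by an `all(...)` over no windows; returning `NA` for that case is a choice made in this sketch, not something the answer specifies.

```r
df <- data.frame(group = c(1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 4, 4),
                 value = c("A", "B", "C", "B", "A", "A", "A", "B", "D", "A", "A", "B"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE if every run of `width` consecutive elements contains `val`
every_window_has <- function(v, val, width = 3) {
  n_windows <- length(v) - width + 1
  if (n_windows < 1) return(NA)  # assumption: too few rows means "undefined"
  all(vapply(seq_len(n_windows),
             function(i) val %in% v[i:(i + width - 1)],
             logical(1)))
}

# One logical per group, named by group id ("1", "2", "4")
every_c <- vapply(split(df$value, df$group), every_window_has, logical(1), val = "C")
every_b <- vapply(split(df$value, df$group), every_window_has, logical(1), val = "B")

# Map the per-group result back onto the rows and label it
df$want_every_c <- c("no", "yes")[every_c[as.character(df$group)] + 1]
df$want_every_b <- c("no", "yes")[every_b[as.character(df$group)] + 1]
```

For the sample data this should agree with the want_every_c and want_every_b columns shown above: group 1 passes for both values, group 2 passes only for "B", and group 4 fails for both because the window DAA contains neither.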

Since this is done for multiple values, wrapping the logic in functions is more useful:

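# 'yes' if val occurs anywhere in the column (called per group inside mutate)
# The original wrapped colNm in {{ }}; in a plain function that is just
# nested braces, so it is dropped here without changing behavior
f1 <- function(colNm, val){
          c('no', 'yes')[(val %in% colNm) + 1]
 }


# 'yes' if every rolling window of 3 consecutive values contains val
f2 <- function(colNm, val){
        c('no', 'yes')[(all(rollapply(colNm, 3, 
             FUN = function(x) val %in% x))) + 1]
 }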

df %>%
    group_by(group) %>%
    mutate(want_any_c = f1(value, "C"), 
           want_any_b = f1(value, "B"),
           want_every_c = f2(value, "C"),
           want_every_b = f2(value, "B"))
