Loop over specifically named columns to check values (next level)

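Problem description

I want to achieve the following:

  1. Loop over all CHECK columns. Sometimes there are more of them (up to 20). The same goes for the data (there will definitely be more than 3 observations). Feel free to use my variables CHECKnum, CHECKstart, or CHECKend.
  2. Check whether any of them contains a value starting with A. If so, return the column name; otherwise return CHECK0.
  3. This used to be done with an exact match on A001, but I need a function that behaves like str_detect.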

Sample data

mydf <- data.frame(case = c(1, 2, 3),
                   id = c(10, 11, 12),
                   CHECK1 = c("A001", "B001", "C001"),
                   CHECK2 = c("Z001", "B001", "C001"),
                   CHECK3 = c("Z001", "B001", "C001"),
                   CHECK4 = c("Z001", "B001", "A001"),
                   CHECK5 = c("Z001", "B001", "C001"))

Attempt:

#Select the columns to check
cols <- grep('CHECK', names(mydf), value = TRUE)
#Compare the value
#mat <- mydf[cols] == 'A001'
mat <- str_detect(mydf[cols], 'A')
#Find the column name where the value exist in each row
mydf$result <- max.col(mat)
#If the value does not exist in the row turn to `NA`.
mydf$result[rowSums(mat) == 0] <- NA
mydf

#  case id CHECK1 CHECK2 CHECK3 CHECK4 CHECK5 result
#1    1 10   A001   Z001   Z001   Z001   Z001      1
#2    2 11   B001   B001   B001   B001   B001   <NA>
#3    3 12   C001   C001   C001   A001   C001      4

I would like the result to look something like this (image omitted).

Tags: r, loops

Solution


You can use sapply with startsWith to find the values that start with 'A'.

cols <- grep('CHECK', names(mydf), value = TRUE)
#Compare the value
mat <- sapply(mydf[cols], startsWith, 'A')
#Find the column name where the value exist in each row
mydf$result <- cols[max.col(mat)]
#If the value does not exist in the row turn to 'CHECK0'.
mydf$result[rowSums(mat) == 0] <- 'CHECK0'
mydf

#  case id CHECK1 CHECK2 CHECK3 CHECK4 CHECK5 result
#1    1 10   A001   Z001   Z001   Z001   Z001 CHECK1
#2    2 11   B001   B001   B001   B001   B001 CHECK0
#3    3 12   C001   C001   C001   A001   C001 CHECK4
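
Since the question asks for something closer to str_detect, here is a minimal sketch of a pattern-based variant using base R's grepl. The helper name first_match_col and its arguments are hypothetical and not part of the original answer. It assumes that when several CHECK columns match in the same row, the first one should win, so it passes ties.method = "first" to max.col (the default picks among ties at random). Unlike startsWith, grepl returns FALSE for NA values, so rows containing NA do not break the rowSums test.

```r
# Hypothetical helper (not from the original answer): for each row, return the
# name of the first column whose value matches `pattern`, or `default` if none.
first_match_col <- function(df, col_pattern = "^CHECK", pattern = "^A",
                            default = "CHECK0") {
  # Select the columns to check by name
  cols <- grep(col_pattern, names(df), value = TRUE)
  # One logical column per CHECK column; cbind keeps a matrix even for one row
  mat <- do.call(cbind, lapply(df[cols], function(x) grepl(pattern, x)))
  # Index of the first TRUE in each row (ties resolved towards the left)
  result <- cols[max.col(mat, ties.method = "first")]
  # Rows without any match get the default label
  result[rowSums(mat) == 0] <- default
  result
}

mydf <- data.frame(case = c(1, 2, 3),
                   id = c(10, 11, 12),
                   CHECK1 = c("A001", "B001", "C001"),
                   CHECK2 = c("Z001", "B001", "C001"),
                   CHECK3 = c("Z001", "B001", "C001"),
                   CHECK4 = c("Z001", "B001", "A001"),
                   CHECK5 = c("Z001", "B001", "C001"))

mydf$result <- first_match_col(mydf)
mydf$result
# [1] "CHECK1" "CHECK0" "CHECK4"
```

Because the columns are picked up by name pattern, the same call should work unchanged when there are up to 20 CHECK columns or more rows; a different regular expression can be passed as pattern if the match rule changes.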
