Is there a way to check for matching variable names between data frames?

Question

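I've just started using R, so please forgive me if my code or question is a bit clumsy.

I created a list of variable names. I want to check whether the variables in this list exist in another data frame (lfs_ja20). Should I use a for loop or the lapply function? Have I even created the list appropriately?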

#create list of relevant variables

varlist <- tibble::lst("ACTHR","ADDJOB","AGE","AGES","AGWRK","AXFA","AXPA","BACTHR",
            "BANDG","BEFOR","BUSHR","CONMON","COUNTRY","CURED8","DIFJOB",
            "DISEA","DURUN")

#my failed attempt at a loop, which results in no error message, but yields nothing

for(i in varlist) {
  find_var(lfs_ja20,i)
}

#my failed attempt at creating a search function then using lapply, which yields the following error
#message : Error in FUN(X[[i]], ...) : argument "y" is missing, with no default

searchvar <- function(x,y) names(y[grep(x, names(y))])
lapply(varlist, searchvar)

Thank you. If any part of this question is unclear, some quick advice on how to frame it would also help.

Tags: r, list, loops, search, lapply

Answer


If you only need to check whether the column names match the elements of your list, you can use the %in% operator.
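As a minimal sketch of how %in% behaves: it returns one logical value per element on its left-hand side, telling you whether that element appears anywhere on the right-hand side. The names `wanted` and `columns` below are made up for illustration and are not taken from lfs_ja20.

```r
# Hypothetical names, for illustration only
wanted  <- c("AGE", "SEX", "COUNTRY")
columns <- c("COUNTRY", "AGE", "INCOME")

# One logical per element of `wanted`: is it found in `columns`?
present <- wanted %in% columns
print(present)
# [1]  TRUE FALSE  TRUE

# Summaries: are all of them present? is at least one present?
all_present <- all(present)
any_present <- any(present)
```

Because the result lines up with the left-hand side, it can be used directly to subset the vector of names you were looking for.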

Here I reproduce your example, using only a base list and a data.frame. I built the sample data frame myself, since none was provided.

varlist <- list("ACTHR","ADDJOB","AGE","AGES","AGWRK","AXFA","AXPA","BACTHR",
                "BANDG","BEFOR","BUSHR","CONMON","COUNTRY","CURED8","DIFJOB",
                "DISEA","DURUN")
dfX <- data.frame(CURED8 = c(1,2), CONMON =c('a','b'), DURUN = c('g','t'))

The match then gives the expected result:

varlist %in% names(dfX)
# names() function gets column names of dataframe

The output is:

 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE
[17]  TRUE
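A logical vector is hard to read on its own, so it may help to turn it back into names. The sketch below reuses the sample data from this answer and shows which variables were found and which are absent; `vars`, `found`, and `absent` are names introduced here for illustration. It also shows one possible way to make the lapply attempt from the question run: the error occurs because searchvar takes two arguments and lapply only supplies the first, so the data frame has to be passed as an extra named argument.

```r
varlist <- list("ACTHR","ADDJOB","AGE","AGES","AGWRK","AXFA","AXPA","BACTHR",
                "BANDG","BEFOR","BUSHR","CONMON","COUNTRY","CURED8","DIFJOB",
                "DISEA","DURUN")
dfX <- data.frame(CURED8 = c(1,2), CONMON = c('a','b'), DURUN = c('g','t'))

# Flatten the list into a plain character vector so it can be subset by a logical
vars <- unlist(varlist)

# Names from the list that are columns of dfX, and those that are not
found  <- vars[vars %in% names(dfX)]
absent <- vars[!(vars %in% names(dfX))]
print(found)

# The searchvar function from the question, unchanged
searchvar <- function(x,y) names(y[grep(x, names(y))])

# Extra arguments to lapply are forwarded to the function, so y is no longer missing
matches <- lapply(vars, searchvar, y = dfX)
```

Note that grep does pattern matching, so searchvar with "AGE" would also return a column named "AGES" if the data frame had one; %in% only reports exact matches. As for the for loop in the question, results are not printed automatically inside a loop in R, so each result would need to be wrapped in print() or stored to become visible.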
