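r - Is there a way to check for matching variable names between data frames?
Question
I have just started using R, so forgive me if my code or question is a bit dense.
I created a list of variable names. I want to check whether the variables in this list exist in another data frame (lfs_ja20). Should I use a for loop or the lapply function? Did I even create the list appropriately?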
#create list of relevant variables
varlist <- tibble::lst("ACTHR","ADDJOB","AGE","AGES","AGWRK","AXFA","AXPA","BACTHR",
"BANDG","BEFOR","BUSHR","CONMON","COUNTRY","CURED8","DIFJOB",
"DISEA","DURUN")
#my failed attempt at a loop, which results in no error message, but yields nothing
for(i in varlist) {
find_var(lfs_ja20,i)
}
#my failed attempt at creating a search function then using lapply, which yields the following error message:
#Error in FUN(X[[i]], ...) : argument "y" is missing, with no default
searchvar <- function(x,y) names(y[grep(x, names(y))])
lapply(varlist, searchvar)
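Thank you. If any part of this question is unclear, some quick advice on how to frame it would also be helpful.
Answer
If you only need to check whether the column names match the elements of your list, you can use the %in% operator.
Here I reproduced your example, except that I used a base list and a data.frame. I made up the sample data frame myself, since none was provided.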
varlist <- list("ACTHR","ADDJOB","AGE","AGES","AGWRK","AXFA","AXPA","BACTHR",
"BANDG","BEFOR","BUSHR","CONMON","COUNTRY","CURED8","DIFJOB",
"DISEA","DURUN")
dfX <- data.frame(CURED8 = c(1,2), CONMON =c('a','b'), DURUN = c('g','t'))
The match then gives the expected result:
varlist %in% names(dfX)
# names() function gets column names of dataframe
The output is
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE
[17] TRUE
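If you want the matching names themselves rather than a TRUE/FALSE vector, here is a minimal sketch. It assumes a shortened varlist stored as a plain character vector and the same made-up dfX as above; the names present and missing_vars are only illustrative. It also shows one possible way to make the lapply attempt from the question run, by passing the data frame as the extra y argument. Note that grep does regular-expression matching, so a pattern like "AGE" would also match a column called "AGES". Separately, values computed inside a for loop are not printed automatically in R, which may be why the loop appeared to yield nothing; wrapping the call in print() would show them.

```r
# Shortened variable list as a character vector (assumption: a plain
# vector is enough here; a list works with %in% too, as shown above)
varlist <- c("ACTHR", "ADDJOB", "AGE", "CONMON", "CURED8", "DURUN")

# Made-up sample data frame, same as in the answer above
dfX <- data.frame(CURED8 = c(1, 2), CONMON = c("a", "b"), DURUN = c("g", "t"))

# Names from varlist that exist as columns in dfX
present <- intersect(varlist, names(dfX))

# Names from varlist that are not columns in dfX
missing_vars <- setdiff(varlist, names(dfX))

# The search function from the question; lapply needs the second
# argument (y) passed explicitly, otherwise it is missing
searchvar <- function(x, y) names(y[grep(x, names(y))])
matches <- lapply(varlist, searchvar, y = dfX)

print(present)
print(missing_vars)
```

With your real data you would replace dfX with lfs_ja20. intersect and setdiff do exact matching, so they are usually the safer choice when the variable names are known in full.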