How to delete rows with NULL values in R

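Problem description

Below is sample data and one manipulation. In the bigger picture, I am reading a bunch of Excel files split by year, then selecting only some columns (14 out of 1000) and putting them into new data frames (e.g. df1, df2). From there, I combine these new data frames into one final data frame. My question is how to delete the rows in the final data frame that are filled with NULL values. I could filter them, but I would prefer to simply delete them in R and be done with them.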

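 # "NULL" is stored as a string, so both count columns in test1 become character
 testyear <- c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2010)
 teststate <- c("CA", "Co", "NV", "NE", "CA", "CO", "NV", "NE")
 totalhousehold <- c(251, 252, 253, "NULL", 301, 302, 303, "NULL")
 marriedhousehold <- c(85, 86, 87, "NULL", 158, 159, 245, "NULL")

 test1 <- data.frame(testyear, teststate, totalhousehold, marriedhousehold)

 # The 2012 data has no "NULL" entries, so its count columns stay numeric
 testyear <- c(2012, 2012, 2012, 2012)
 teststate <- c("WA", "OR", "WY", "UT")
 totalhousehold <- c(654, 650, 646, 641)
 marriedhousehold <- c(400, 399, 398, 395)

 test2 <- data.frame(testyear, teststate, totalhousehold, marriedhousehold)

 # Stack both data frames into the final data frame
 test3 <- rbind(test1, test2)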

Tags: r, dataframe, dplyr, null

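Solution

Because these are character columns, we can use filter with across on the character columns only, keeping the rows that contain no "NULL" elements, and then fix the column types with type.convert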

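library(dplyr)
# Keep rows where every character column differs from "NULL", then re-infer column types.
# Note: recent dplyr versions suggest if_all() in place of across() inside filter().
test4 <- test3 %>%
  filter(across(where(is.character), ~ . != "NULL")) %>%
  type.convert(as.is = TRUE)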

-output

> test4
   testyear teststate totalhousehold marriedhousehold
1      2010        CA            251               85
2      2010        Co            252               86
3      2010        NV            253               87
4      2011        CA            301              158
5      2011        CO            302              159
6      2011        NV            303              245
7      2012        WA            654              400
8      2012        OR            650              399
9      2012        WY            646              398
10     2012        UT            641              395
> str(test4)
'data.frame':   10 obs. of  4 variables:
 $ testyear        : int  2010 2010 2010 2011 2011 2011 2012 2012 2012 2012
 $ teststate       : chr  "CA" "Co" "NV" "CA" ...
 $ totalhousehold  : int  251 252 253 301 302 303 654 650 646 641
 $ marriedhousehold: int  85 86 87 158 159 245 400 399 398 395
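
As a minimal sketch of why type.convert is needed here: mixing numbers with the string "NULL" makes c() coerce the whole vector to character, and the values stay character even after the "NULL" entries are dropped. The names kept and converted are illustrative only and do not appear in the answer above.

```r
# Mixing numbers with the string "NULL" coerces the whole vector to character.
totalhousehold <- c(251, 252, 253, "NULL")
class(totalhousehold)   # "character"

# Dropping the "NULL" entries alone leaves a character vector ...
kept <- totalhousehold[totalhousehold != "NULL"]

# ... so type.convert() is used to re-infer a more suitable type.
converted <- type.convert(kept, as.is = TRUE)
class(converted)        # "integer"
```

With as.is = TRUE, any remaining character columns stay character instead of being turned into factors, which is why teststate is still chr in the str() output above.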

Or, in base R, use subset with rowSums to build the logical expression

type.convert(subset(test3, !rowSums(test3 == "NULL")), as.is = TRUE)
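
The one-liner above can be unpacked step by step. The following is only a sketch on a small stand-in data frame; df, is_null, null_rows and cleaned are hypothetical names chosen for illustration, not part of the original answer.

```r
# Small stand-in for test3: the "NULL" string forces the count column to character.
df <- data.frame(
  year  = c(2010, 2010, 2011),
  state = c("CA", "NE", "NV"),
  total = c("251", "NULL", "303")
)

# Step 1: compare every cell with "NULL"; the result is a logical matrix.
is_null <- df == "NULL"

# Step 2: count the matches per row; any count above zero marks a row to drop.
null_rows <- rowSums(is_null) > 0

# Step 3: keep the remaining rows and re-infer the column types.
cleaned <- type.convert(df[!null_rows, ], as.is = TRUE)
str(cleaned)
```

In the one-liner, !rowSums(...) relies on a count of 0 being treated as FALSE, so negating it gives TRUE for exactly the rows without any "NULL". If the data may also contain real NA values, passing na.rm = TRUE to rowSums keeps the count well defined.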
