Fill missing values in R if all other values are 0

Problem description

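I am trying to fill in missing values in R.

If all the other values in a column are 0, I want to fill the missing values with 0.

An example is shown below. In this data, every value in column c other than the NAs is 0, so I want to fill the NAs with 0.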

set.seed(1000)
a<-rnorm(10)
b<-rnorm(10)
c<-rep(0,10)
c[c(2,4,8)]<-NA
test<-cbind(a,b,c)

               a          b  c
 [1,]  0.1901328  0.6141360  0
 [2,] -0.9884426  0.6508993 NA
 [3,] -0.9783197  2.1059862  0
 [4,] -1.8584651  0.4354903 NA
 [5,]  0.6623067  1.6382126  0
 [6,] -1.2542872  0.1370791  0
 [7,] -1.9971880  1.9302738  0
 [8,]  1.9417941  0.0449239 NA
 [9,]  1.7046508  1.0726263  0
[10,] -0.7289351 -2.8374912  0

I could not find a good code example. Could you give me a suggestion?

Tags: r, if-statement, na, dplyr

Solution


With setnafill from data.table, you can do this in two passes: first check which columns are all 0, then fill them:

library(data.table)
test = data.table(test)

# this will warn about coercing double -> logical;
#   you may want to suppressWarnings here; more
#   "properly" you would do
#   sapply(test, function(x) any(x != 0, na.rm = TRUE))
empty_cols = !sapply(test, any, na.rm = TRUE)

# use setnafill to do the replacement in-place
setnafill(test, type = 'const', fill = 0, cols = which(empty_cols))
test[]
#               a           b c
#  1: -0.44577826 -0.98242783 0
#  2: -1.20585657 -0.55448870 0
#  3:  0.04112631  0.12138119 0
#  4:  0.63938841 -0.12087232 0
#  5: -0.78655436 -1.33604105 0
#  6: -0.38548930  0.17005748 0
#  7: -0.47586788  0.15507872 0
#  8:  0.71975069  0.02493187 0
#  9: -0.01850562 -2.04658541 0
# 10: -1.37311776  0.21315411 0
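
If you would rather stay in base R and keep test as a matrix, the same two-pass idea can be sketched roughly as follows. The helper name fill_all_zero_cols is made up for illustration and is not part of the answer above. It assumes a column counts as "all zero" when every non-missing value equals 0, so a column that is entirely NA would also be filled.

```r
# Hypothetical helper (name chosen for illustration only): fill NAs with 0
# in every column whose non-missing values are all 0.
fill_all_zero_cols <- function(m) {
  # Pass 1: flag columns that contain no non-zero, non-missing value
  all_zero <- apply(m, 2, function(x) !any(x != 0, na.rm = TRUE))
  # Pass 2: replace NAs only in the flagged columns
  for (j in which(all_zero)) {
    m[is.na(m[, j]), j] <- 0
  }
  m
}

# Same example data as in the question
set.seed(1000)
a <- rnorm(10)
b <- rnorm(10)
c <- rep(0, 10)
c[c(2, 4, 8)] <- NA
test <- cbind(a, b, c)

filled <- fill_all_zero_cols(test)
filled
```

Comparing x != 0 explicitly avoids the coercion warning that any() gives on doubles, and columns a and b are left untouched because they contain non-zero values.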
