Replacing values in a data frame - why doesn't it work?

Question

for(i in 1:nrow(survey)){
    for(j in 1:ncol(survey)){
        if(survey[i,j] < 1 | survey[i,j] >10){
            survey[i,j] <- "NA"
        }
    }
}

I have this dataset:

survey <- data.frame("q1" = c(5, 3, 2, 7, 11, 5),
                     "q2" = c(4, 2, 2, 5, 5, 2),
                     "q3" = c(2, 1, 4, 2, 9, 10),
                     "q4" = c(2, 5, 2, 5, 4, 2),
                     "q5" = c(1, 4, -20, 2, 11, 2))

I want to replace all numbers less than 1 or greater than 10, so I wrote the R code above. When I run the code, I get this:

  q1 q2 q3 q4 q5
1  5  4  2  2  1
2  3  2  1  5  4
3  2  2  4  2 NA
4  7  5  2  5 NA
5 NA  5  9  4 NA
6 NA  2 10  2 NA

Why doesn't it work? What am I missing in my code? Can anyone give me some advice?

Tags: r, replace, na

Answer


Another way is to use the replacement form of is.na, i.e. is.na<-, which sets to NA every cell where the condition on the right-hand side is TRUE:

is.na(survey) <- survey > 10|survey < 1
survey
  q1 q2 q3 q4 q5
1  5  4  2  2  1
2  3  2  1  5  4
3  2  2  4  2 NA
4  7  5  2  5  2
5 NA  5  9  4 NA
6  5  2 10  2  2
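
The same replacement can also be sketched with logical-matrix indexing, which likewise avoids the explicit loops. This is an alternative and not part of the approach above; the name survey2 is an assumption used here only to keep the original data untouched.

```r
survey <- data.frame(q1 = c(5, 3, 2, 7, 11, 5),
                     q2 = c(4, 2, 2, 5, 5, 2),
                     q3 = c(2, 1, 4, 2, 9, 10),
                     q4 = c(2, 5, 2, 5, 4, 2),
                     q5 = c(1, 4, -20, 2, 11, 2))

# survey2 is a hypothetical working copy
survey2 <- survey

# The comparison yields a logical matrix; every TRUE cell is set to NA
survey2[survey2 < 1 | survey2 > 10] <- NA

# The columns remain numeric because NA (unquoted) is not a string
sapply(survey2, class)
```

Because the unquoted NA is used, no column is coerced to character, so later numeric comparisons on survey2 still behave as expected.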

The main issue, as mentioned in the comments, is that "NA" is a character string. Assigning a character value to an element of a numeric column converts the whole column to character, so every later comparison in that column is done on strings. Instead, use NA without quotes:

for(i in 1:nrow(survey)){
     for(j in 1:ncol(survey)){
         if(survey[i,j] < 1 | survey[i,j] >10){
             survey[i,j] <- NA
         }
     }
 }
 
survey
  q1 q2 q3 q4 q5
1  5  4  2  2  1
2  3  2  1  5  4
3  2  2  4  2 NA
4  7  5  2  5  2
5 NA  5  9  4 NA
6  5  2 10  2  2

The comparison operators work differently for character values: strings are compared character by character rather than by numeric value, i.e.

> "10" > "1"
[1] TRUE
> "10" > "5"
[1] FALSE
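
To connect this to the output in the question, here is a small sketch of what happens to column q5 once the first "NA" string has been assigned. The vector x is made up for illustration and simply holds the same values as q5.

```r
# x is a hypothetical stand-in for column q5
x <- c(1, 4, -20, 2, 11, 2)
class(x)            # "numeric"

# Assigning a string coerces the whole vector to character
x[3] <- "NA"
class(x)            # "character"

# The numbers on the right are now converted to strings before comparing
x[4] < 1            # "2" < "1"  is FALSE
x[4] > 10           # "2" > "10" is TRUE, so the valid value 2 gets replaced too
```

This is why rows after the first replacement in the question's output show NA for values such as 2 and 5 that are actually within range.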
