Using an if statement in apply in R for every value in a data frame

Question

I have a data frame that I created using the read_excel function and then duplicated. I'm going to explain this as if I were using Excel, because it's so easy to do there. I want to check whether each cell in columns 3 to 11 of each row contains a zero and, if so, put a zero in the corresponding cell in columns 12 to 20. If not, keep the original value.

Data2 <- Data1

Data2[,12:20] <- apply(Data1[,3:11],1:2,function(x) {if(x==0) {0}})

This is the error message I get:

Warning message: In [<-.data.frame(*tmp*, , 12:20, value = list(0, 0, 0, 0, 0, : provided 450 variables to replace 9 variables

Example:

Data1 <- matrix(data=c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),nrow=6,ncol=4)
Data2 <- Data1
Data2[,3:4] <- apply(Data1[,1:2],1:2,function(x) if(x==0) {0})
Data2 <- matrix(Data2,nrow=6,ncol=4)

The result should look like this:

     [,1] [,2] [,3] [,4]
[1,]    0    5    0    3
[2,]    1    6    5    4
[3,]    1    2    6    5
[4,]    0    3    0    6
[5,]    3    0    6    0
[6,]    4    5    2    6

where any zero in columns 1 and 2 becomes a zero in the corresponding position in columns 3 and 4.

Instead, I get this:

     [,1] [,2] [,3] [,4]
[1,] 0    5    0    NULL
[2,] 1    6    NULL NULL
[3,] 1    2    NULL NULL
[4,] 0    3    0    NULL
[5,] 3    0    NULL 0   
[6,] 4    5    NULL NULL

Also, I'm still getting the same error message from the original data shown at the beginning, which has 50+ rows and 20 columns.

Tags: r

Answer
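
The NULL entries in the question most likely arise because an `if` without an `else` evaluates to NULL whenever its condition is FALSE, so apply() cannot build a numeric matrix and returns a list instead. A minimal sketch of a vectorised repair with base R's ifelse(), assuming the matrix example from the question (the names no_else, is_zero and fixed are made up for illustration):

```r
# Example matrix from the question
Data1 <- matrix(data = c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),
                nrow = 6, ncol = 4)

# An if without an else yields NULL when the condition is FALSE
no_else <- if (1 == 0) 0
is.null(no_else)

# ifelse() always returns a value for every element:
# 0 where the source cell is zero, otherwise the original target cell
is_zero <- Data1[, 1:2] == 0
fixed <- Data1
fixed[, 3:4] <- ifelse(is_zero, 0, Data1[, 3:4])
fixed
```

Because ifelse() supplies the original target value in the "no" branch, no cell is left as NULL and the result stays a numeric matrix.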


Here is an alternative solution:

First, create a logical matrix indicating which elements of the columns of interest are 0.

mat <- Data1[,1:2] == 0
mat

      [,1]  [,2]
[1,]  TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,]  TRUE FALSE
[5,] FALSE  TRUE
[6,] FALSE FALSE

Then select the elements of the target columns where the logical matrix is TRUE and set them to 0:

Data2[,3:4][mat==TRUE] <- 0
Data2

     [,1] [,2] [,3] [,4]
[1,]    0    5    0    3
[2,]    1    6    5    4
[3,]    1    2    6    5
[4,]    0    3    0    6
[5,]    3    0    6    0
[6,]    4    5    2    6
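
The same indexing idea should carry over to a data frame like the one read with read_excel in the question. The following is only a sketch on a small stand-in data frame built from the example matrix; the names df1, df2, src and dst are hypothetical, and for the real data the column ranges would presumably be 3:11 and 12:20.

```r
# Stand-in data frame built from the question's example matrix
# (as.data.frame names the columns V1..V4)
df1 <- as.data.frame(matrix(c(0,1,1,0,3,4,5,6,2,3,0,5,6,5,6,2,6,2,3,4,5,6,5,6),
                            nrow = 6, ncol = 4))
df2 <- df1

src <- 1:2   # columns to check for zeros (assumed 3:11 in the real data)
dst <- 3:4   # columns to overwrite (assumed 12:20 in the real data)

# Comparing a data frame with a scalar gives a logical matrix,
# which can index the target columns directly
df2[, dst][df1[, src] == 0] <- 0
df2
```

Both column ranges must have the same width so that the logical matrix lines up with the target columns.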
