cbind() on matrices with different lengths

Problem description

I am writing a for loop that converts the output of summary() for a set of vectors into a matrix like foo, below:

            introA   introB  introC  helpA    helpB   helpC
Min.        1        1        4       4       2       4
1st Qu.     5        5        5       5       4       5
Median      5        5        5       5       4       5
Mean        4.83     4.71     4.96    4.89    4.02    4.77
3rd Qu.     5        5        5       5       5       5
Max.        5        5        5       5       5       5
NA's        2        5        0       3       0       2

Note that introC and helpB have zeros in the NA row, and that summary() does NOT produce this by default - if you call summary() on a vector with no NA values, the result is an object with length of 6 instead of 7.
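The length difference described above can be checked directly. This is a minimal sketch; the vectors `no_na` and `with_na` are made-up examples, not part of the original data.

```r
# Hypothetical example vectors (not from the original data)
no_na   <- c(1, 2, 3, 4, 5)
with_na <- c(1, 2, NA, 4, 5)

# Without missing values, summary() returns six statistics
len_no_na <- length(summary(no_na))      # 6

# With missing values, a seventh element named "NA's" is appended
len_with_na <- length(summary(with_na))  # 7
last_name   <- names(summary(with_na))[7]

print(c(len_no_na, len_with_na))
print(last_name)
```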

My for loop initializes an empty matrix x, assigns the result of summary() for each numeric vector in a data frame to x, and binds each x to a larger object y. This works on data frames where either no vector has missing values or every vector does.

When some vectors have missing values and others do not, I've written this work-around:

x <- matrix(NA,nrow=7,ncol=1)
y <- NULL

for(i in 1:ncol(foo)){

  if(length(summary(foo[,i]==6))){

    x <- as.matrix(c(summary(foo[,i]), 0))
    rownames(x) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.", "NA's")

  }else if(length(summary(foo[,i]==7))){

    x <- as.matrix(summary(foo[,i]))
    rownames(x) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.", "NA's")

  }

  y <- cbind(y,x)
  x <- matrix(NA,nrow=7,ncol=1)
}

Here I check whether the summary() of a vector has length 6 or 7, and when it has length 6 I append a zero as the NA's row before binding the results together. Outside of my loop, this works. For some reason I get the following error when I run it within the loop:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [1] not equal to array extent

Any idea how my length could fail to match the array extent? I have checked the length of summary() for all vectors in foo; all are either length 6 or 7.

Tags: r, loops, matrix, data-manipulation, summary

Solution


First we put the columns of the iris dataset into the global environment as vectors, and we set a few values in one of them to NA:

list2env(iris[1:4],envir = globalenv())
Sepal.Length[1:3] <- NA

Then:

sapply(list(Sepal.Length = Sepal.Length, Sepal.Width = Sepal.Width,
            Petal.Length = Petal.Length, Petal.Width = Petal.Width),
       # summary() drops "NA's" when nothing is missing, so add it back as 0
       function(x) { x <- summary(x); if (is.na(x["NA's"])) x["NA's"] <- 0; x })

#         Sepal.Length Sepal.Width Petal.Length Petal.Width
# Min.        4.300000    2.000000        1.000    0.100000
# 1st Qu.     5.100000    2.800000        1.600    0.300000
# Median      5.800000    3.000000        4.350    1.300000
# Mean        5.862585    3.057333        3.758    1.199333
# 3rd Qu.     6.400000    3.300000        5.100    1.800000
# Max.        7.900000    4.400000        6.900    2.500000
# NA's        3.000000    0.000000        0.000    0.000000
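As for the error in the question's loop, a likely cause is the placement of the closing parenthesis: `length(summary(foo[,i]==6))` summarises the logical vector `foo[,i]==6`, and its length is always non-zero, so the first branch runs for every column. For a column that already has an NA's entry, `c(summary(foo[,i]), 0)` then has 8 elements, which does not match the 7 row names. The sketch below moves the comparison outside `length()`; the data frame `foo` here is a small hypothetical stand-in, not the asker's data.

```r
# Hypothetical stand-in for the data frame `foo` from the question
foo <- data.frame(introA = c(1, 5, NA, 5), introB = c(4, 5, 5, 5))

row_names <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.", "NA's")
y <- NULL

for (i in seq_len(ncol(foo))) {
  # Compute the summary once and drop its class to get a plain numeric vector
  vals <- as.numeric(summary(foo[, i]))

  # Compare the length of the summary itself with 6, then pad with a zero
  if (length(vals) == 6) {
    vals <- c(vals, 0)
  }

  x <- matrix(vals, ncol = 1, dimnames = list(row_names, NULL))
  y <- cbind(y, x)
}

colnames(y) <- names(foo)
print(y)
```

With this condition, columns without missing values get a 0 in the NA's row and columns with missing values are bound unchanged, so every x has 7 rows.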
