Is there a way in R to sum columns with different patterns of missing observations?

Problem description

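I have some variables that I would like to add together, but some of them have missing observations, and when they are added, any row with one or more missing values ends up with a missing sum. For example, suppose I have the following, where the last column is the sum I currently get: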

df <- matrix(c(23, NA, 56, NA,
               NA, 43, 67, NA,
               11, 10, 18, 39), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z", "sum")
df
      X  y  z sum
[1,] 23 NA 56  NA
[2,] NA 43 67  NA
[3,] 11 10 18  39

Here is what I expect instead:

df2 <- matrix(c(23, NA, 56,  79,
                NA, 43, 67, 110,
                11, 10, 18,  39), byrow = TRUE, nrow = 3)

colnames(df2) <- c("X", "Y", "Z", "sum")

df2
      X  Y  Z sum
[1,] 23 NA 56  79
[2,] NA 43 67 110
[3,] 11 10 18  39

How can I get this result?

I am using R version 3.6 on Windows 10.

Tags: r, sum, summary, rowsum

Solution


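As Ben pointed out, I think all you need is na.rm = TRUE, which tells rowSums to skip missing values instead of propagating them. So something like this: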

df <- matrix(c(23, NA, 56,
               NA, 43, 67,
               11, 10, 18), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z")
cbind(df, summ = rowSums(df, na.rm = TRUE))
#       X  y  z summ
# [1,] 23 NA 56   79
# [2,] NA 43 67  110
# [3,] 11 10 18   39
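
One caveat worth knowing: with na.rm = TRUE, a row made up entirely of NA sums to 0 rather than NA. If that distinction matters, a minimal sketch along the following lines keeps such rows as NA. The helper name row_sums_keep_all_na and the example matrix m are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): sum each row while
# ignoring NA, but return NA for rows in which every value is missing.
row_sums_keep_all_na <- function(x) {
  x <- as.matrix(x)
  totals <- rowSums(x, na.rm = TRUE)
  all_missing <- rowSums(!is.na(x)) == 0  # TRUE where a row has no observed value
  totals[all_missing] <- NA
  totals
}

# Illustrative data: the second row is entirely missing
m <- matrix(c(23, NA, 56,
              NA, NA, NA,
              11, 10, 18), byrow = TRUE, nrow = 3)
colnames(m) <- c("X", "y", "z")

rowSums(m, na.rm = TRUE)   # 79  0 39
row_sums_keep_all_na(m)    # 79 NA 39
```

Whether an all-missing row should count as 0 or NA depends on what the sum means in your data, so treat this as one option rather than the required behaviour.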

Or, if you are working with a data frame, something like this:

    library(dplyr)
    df_frame <- data.frame(df)
    df_frame <- df_frame %>%
      mutate(summ = rowSums(., na.rm = TRUE))
    df_frame
    #    X  y  z summ
    # 1 23 NA 56   79
    # 2 NA 43 67  110
    # 3 11 10 18   39




Or this, if you only want to sum the numeric variables in the data frame:

    df_frame <- data.frame(df)
    df_frame <- df_frame %>%
      mutate(summ = rowSums(select_if(., is.numeric), na.rm = TRUE))
    df_frame
    #    X  y  z summ
    # 1 23 NA 56   79
    # 2 NA 43 67  110
    # 3 11 10 18   39
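
If you would rather not load dplyr, the same idea can be sketched in base R. This assumes a data frame shaped like df_frame above. The label column and the num_cols variable are added here purely for illustration and do not appear in the original question.

```r
# Rebuild the example data frame, plus a hypothetical non-numeric column
df_frame <- data.frame(X = c(23, NA, 11),
                       y = c(NA, 43, 10),
                       z = c(56, 67, 18),
                       label = c("a", "b", "c"))

# Identify the numeric columns, then sum across them row by row, skipping NA
num_cols <- vapply(df_frame, is.numeric, logical(1))
df_frame$summ <- rowSums(df_frame[num_cols], na.rm = TRUE)

df_frame$summ  # 79 110 39
```

Selecting the numeric columns first matters because rowSums stops with an error if it is handed a character column.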
