I have a sample dataset with missing values

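Problem description

I have a sample dataset with missing values. I want to create a new column containing a message that says which column values are missing in each row, covering every possible combination of columns.

Example: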

Dataset (NA marks a missing value):

A   B   C   D
1   NA  2   4
NA  4   4   NA
4   1   NA  NA
NA  NA  NA  NA
3   NA  2   3

The possible combinations of columns for the dataset above are:

"a", "b", "c", "d", "a, b", "a, c", "a, d", "b, c", "b, d", "c, d", "a, b, c", "a, b, d", "a, c, d", "b, c, d", "a, b, c, d"

Result:

A   B   C   D   Message
1   NA  2   4   Column B is missing
NA  4   4   NA  Columns A and D are missing
4   1   NA  NA  Columns C and D are missing
NA  NA  NA  NA  All column values are missing
3   NA  2   3   Column B is missing

Any suggestions would be greatly appreciated.

Tags: r

Solution

Here is one approach using apply in base R:

set.seed(4)
# Build a 5 x 3 data frame of random values drawn from 1..5 and NA
df <- data.frame(matrix(sample(c(1:5, NA), 15, replace = TRUE), ncol = 3))
names(df) <- LETTERS[1:3]

# For each row, list the names of the columns whose value is NA
df$msg <- apply(df, 1, function(x) {
  if (anyNA(x)) {
    paste(paste(names(x)[is.na(x)], collapse = " "), "missing")
  } else {
    "No missing"
  }
})

df

  A  B  C             msg
1 4  2  5      No missing
2 1  5  2      No missing
3 2 NA  1       B missing
4 2 NA NA     B C missing
5 5  1  3      No missing
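
The answer above uses random data and a short message format. As a rough sketch of how the same idea might be adapted to the four-column dataset and the message wording from the question, the following builds the messages from the logical matrix returned by is.na. The helper name describe_missing and the exact message wording are assumptions made for illustration; they do not come from the original answer or from any package.

```r
# Sample data reconstructed from the question; NA marks a missing cell.
df <- data.frame(
  A = c(1, NA, 4, NA, 3),
  B = c(NA, 4, 1, NA, NA),
  C = c(2, 4, NA, NA, 2),
  D = c(4, NA, NA, NA, 3)
)

# describe_missing() is a hypothetical helper, not part of base R.
# It returns one message per row naming the columns that are NA.
describe_missing <- function(data) {
  na_mat <- is.na(data)        # logical matrix: TRUE where a value is missing
  cols <- colnames(na_mat)
  msgs <- apply(na_mat, 1, function(is_missing) {
    missing_cols <- cols[is_missing]
    n <- length(missing_cols)
    if (n == 0) {
      "No missing values"
    } else if (n == length(cols)) {
      "All column values are missing"
    } else if (n == 1) {
      paste("Column", missing_cols, "is missing")
    } else {
      # Join all but the last name with commas, then add "and <last>"
      paste("Columns",
            paste(missing_cols[-n], collapse = ", "),
            "and", missing_cols[n], "are missing")
    }
  })
  unname(msgs)                 # drop any row names carried over by apply
}

# Compute the messages before adding the new column, so that the
# "all missing" check only counts the original data columns.
df$Message <- describe_missing(df)
print(df)
```

With this approach there is no need to enumerate the 15 column combinations by hand, since each message is assembled from whichever column names are NA in that row. For the sample data, the first row yields "Column B is missing" and the second yields "Columns A and D are missing".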
