R: Get the names of columns whose values are not NA

Problem description

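I have a table with 7 columns: the first column is an id, the next 3 columns are vegetable types, and the last 3 columns are fruit types. The values indicate whether a person has that vegetable or fruit. Is there a way to group the vegetable columns and the fruit columns and, for each person, output the names of the columns for the vegetables and fruits they have?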

Input data frame:

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) = c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

Expected output data frame:

output_id1 <- c("id_1", "lettuce", "apple")
output_id2 <- c("id_2", "tomato, bellpeper", NA)
output <- data.frame(rbind(output_id1, output_id2))
colnames(output) <- c("id", "veg", "fruit")

Tags: r, dataframe, tidyverse

Solution


This should do the trick!

id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, 1, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) = c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Remove the id column, it's not necessary
input_without_id <- dplyr::select(input, -c("id"))

# For each row (MARGIN = 1) of the input, return the column names
# (names(input_without_id)), but only at the positions where the row (x) is not NA
result <- apply(input_without_id, MARGIN = 1, function(x) {
    return(names(input_without_id)[which(!is.na(x))])
})

# Rename the result with the corresponding ids originally found in input.
names(result) <- input$id
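
The code above returns the non-NA column names for each id, but it does not split them into the `veg` and `fruit` columns shown in the expected output. A minimal base R sketch of that grouping step could look like the following. The vectors `veg_cols` and `fruit_cols` and the helper `collapse_present` are hypothetical names introduced here for illustration; the column grouping itself is assumed from the question's description (3 vegetable columns followed by 3 fruit columns).

```r
# Same input as in the question.
id1 <- c("id_1", 1, NA, NA, NA, 1, NA)
id2 <- c("id_2", NA, 1, 1, NA, NA, NA)
input <- data.frame(rbind(id1, id2))
colnames(input) <- c("id", "lettuce", "tomato", "bellpeper", "pineapple", "apple", "banana")

# Assumed grouping of the columns, taken from the question's description.
veg_cols   <- c("lettuce", "tomato", "bellpeper")
fruit_cols <- c("pineapple", "apple", "banana")

# Hypothetical helper: for each row of df, join the names of the columns in
# `cols` that are not NA with ", ", or return NA when none are present.
collapse_present <- function(df, cols) {
  vapply(seq_len(nrow(df)), function(i) {
    present <- cols[!is.na(unlist(df[i, cols]))]
    if (length(present) == 0) NA_character_ else paste(present, collapse = ", ")
  }, character(1))
}

# One row per id, with one collapsed string per group.
output <- data.frame(
  id    = input$id,
  veg   = collapse_present(input, veg_cols),
  fruit = collapse_present(input, fruit_cols)
)
```

Using `vapply` with `character(1)` keeps the result a plain character vector regardless of how many columns are non-NA in each row, whereas `apply` simplifies its result to a matrix whenever every row happens to return the same number of names.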
