Merging a list of data frames into a single data frame by all rows

Problem description

I have a list of data frames, as shown below:

df1

    col1   col2
    house.   10
    cat.      5
    dog       7
    mouse     4

df2

    col1   col2
    house.   6
    apple.   4
    dog      8
    elephant 3

df3

    col1   col2
    horse    1
    banana   1
    dog      8

The desired output would be:

           df1   df2   df3
house.      10     6    NA
cat.         5    NA    NA
dog          7     8     8
mouse        4    NA    NA
apple.      NA     4    NA
elephant    NA     3    NA
horse       NA    NA     1
banana      NA    NA     1

Any suggestions?

I tried the following:

list_df<-list(df1,df2,df3)
df_all<-do.call("rbind", list_df)
df_merge<-as.data.frame(unique(df_all$col1))
colnames(df_merge)<-"category"

df_merge$df1 <- with(df_merge, ifelse (category %in% df1$col1,df1$col2,NA))

However, when I add the second data frame, I get this error: $ operator is invalid for atomic vectors

Tags: r

Solution


Using dplyr

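library(dplyr)
library(tibble)  # column_to_rownames() is provided by tibble, not dplyr

# Full joins keep every value of col1 that appears in any data frame
df <- dplyr::full_join(df1, df2, by = "col1")
df <- dplyr::full_join(df, df3, by = "col1")

# Move col1 into the row names
df %>% 
  tibble::column_to_rownames(var = "col1")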

#          col2.x col2.y col2
#house.       10      6   NA
#cat.          5     NA   NA
#dog           7      8    8
#mouse         4     NA   NA
#apple.       NA      4   NA
#elephant     NA      3   NA
#horse        NA     NA    1
#banana       NA     NA    1
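
The output above carries the suffixed names col2.x, col2.y and col2 rather than the df1, df2, df3 headers asked for in the question. One possible way to get those headers is to rename col2 in each data frame before joining. The following is only a sketch under some assumptions: the data frames are rebuilt here with plain character columns (the original data uses factors), and the names dfs, renamed and result are made up for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

# Sample data rebuilt with character columns (simplification of the factor data)
df1 <- data.frame(col1 = c("house.", "cat.", "dog", "mouse"),
                  col2 = c(10L, 5L, 7L, 4L))
df2 <- data.frame(col1 = c("house.", "apple.", "dog", "elephant"),
                  col2 = c(6L, 4L, 8L, 3L))
df3 <- data.frame(col1 = c("horse", "banana", "dog"),
                  col2 = c(1L, 1L, 8L))

# A named list, so that each data frame carries its own label
dfs <- list(df1 = df1, df2 = df2, df3 = df3)

# Rename col2 in each data frame to the name of its list element
renamed <- imap(dfs, function(d, nm) {
  names(d)[names(d) == "col2"] <- nm
  d
})

# Join everything on col1, then move col1 into the row names
result <- reduce(renamed, full_join, by = "col1") %>%
  column_to_rownames(var = "col1")

print(result)
```

Because every value column has a unique name before the join, no .x/.y suffixes are generated, and the approach does not depend on how many data frames are in the list.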

Update: if you have many data frames, you can use reduce from purrr

library(tidyverse)
list(df1, df2, df3) %>% reduce(full_join, by = "col1") ## this would help
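
If the tidyverse is not available, a similar fold can be written in base R with Reduce() and merge(..., all = TRUE). This is a different technique from the answer's purrr::reduce, shown only as a sketch. It assumes the same sample data, rebuilt here with character columns, and the names dfs, renamed and merged are hypothetical. Note that merge() sorts the rows by the key column by default, so the row order differs from the full_join result.

```r
# Sample data rebuilt with character columns (simplification of the factor data)
df1 <- data.frame(col1 = c("house.", "cat.", "dog", "mouse"),
                  col2 = c(10L, 5L, 7L, 4L))
df2 <- data.frame(col1 = c("house.", "apple.", "dog", "elephant"),
                  col2 = c(6L, 4L, 8L, 3L))
df3 <- data.frame(col1 = c("horse", "banana", "dog"),
                  col2 = c(1L, 1L, 8L))

dfs <- list(df1 = df1, df2 = df2, df3 = df3)

# Give each value column the name of its data frame to avoid name clashes
renamed <- Map(function(d, nm) setNames(d, c("col1", nm)), dfs, names(dfs))

# all = TRUE makes merge() behave like a full outer join
merged <- Reduce(function(x, y) merge(x, y, by = "col1", all = TRUE), renamed)

# Move col1 into the row names
rownames(merged) <- as.character(merged$col1)
merged$col1 <- NULL

print(merged)
```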

Data

df1 <- structure(list(col1 = structure(c(3L, 1L, 2L, 4L), .Label = c("cat.", "dog", "house.", "mouse"), class = "factor"), col2 = c(10L, 5L, 7L, 4L)), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(col1 = structure(c(4L, 1L, 2L, 3L), .Label = c("apple.", "dog", "elephant", "house."), class = "factor"), col2 = c(6L, 4L, 8L, 3L)), class = "data.frame", row.names = c(NA, -4L))
df3 <- structure(list(col1 = structure(c(3L, 1L, 2L), .Label = c("banana", "dog", "horse"), class = "factor"), col2 = c(1L, 1L, 8L)), class = "data.frame", row.names = c(NA, -3L))
