首页 > 解决方案 > 将两个数据框与公共列连接在一起

问题描述

我有两个不同的数据框共享公共列,如下所示:

数据框 A

structure(list(Firm = c("Alex", NA, NA), Postal.Code = c("V0N 1B4", 
"V0N 1B4", "V0N 1B4"), sold.month = c(NA_real_, NA_real_, NA_real_
), sold.year = c(NA_real_, NA_real_, NA_real_), sold.qtr = c(NA_real_, 
NA_real_, NA_real_), List.year = c(2018, 2018, 2018), List.Date.Year.quarter = c("2018 Q2", 
"2018 Q2", "2018 Q2")), row.names = c(NA, 3L), class = "data.frame")

  Firm Postal.Code sold.month sold.year sold.qtr List.year List.Date.Year.quarter
1 Alex     V0N 1B4         NA        NA       NA      2018                2018 Q2
2 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2
3 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2

数据框 B

structure(list(sold.month = c(NA, 1L, 1L), sold.year = c(NA, 
2020L, 2020L), sold.qtr = c(NA, 1L, 1L), List.Date.Year.quarter = structure(c(2019.75, 
2019.75, 2019.75), class = "yearqtr"), List.Date.Year.month = structure(c(2019.75, 
2019.91666666667, 2019.91666666667), class = "yearmon"), Sold.Date.Year.month = structure(c(NA, 
2020, 2020), class = "yearmon")), row.names = c(NA, 3L), class = "data.frame")

  sold.month sold.year sold.qtr List.Date.Year.quarter List.Date.Year.month Sold.Date.Year.month
1         NA        NA       NA                2019 Q4             Oct 2019                 <NA>
2          1      2020        1                2019 Q4             Dec 2019             Jan 2020
3          1      2020        1                2019 Q4             Dec 2019             Jan 2020

我想要做什么: 我只想加入数据框A中存在的数据框B的列。如果B中不存在数据框A的列,则该值应显示为NA。

预期的结果应该是:

 Firm Postal.Code sold.month sold.year sold.qtr List.year List.Date.Year.quarter
1 Alex     V0N 1B4         NA        NA       NA      2018                2018 Q2
2 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2
3 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2
4  NA         NA           NA        NA       NA       NA                 2019 Q4
5  NA         NA            1       2020       1       NA                 2019 Q4
6  NA         NA            1       2020       1       NA                 2019 Q4

标签: rdataframejoinmerge

解决方案


您可以将列的类型更改为字符,然后用于bind_rows组合它们。

library(dplyr)

A %>% 
  mutate(across(.fns = as.character)) %>%
  bind_rows(B %>%
            select(intersect(names(.), names(A))) %>%
            mutate(across(.fns = as.character))) %>%
  type.convert(as.is = TRUE)

#  Firm Postal.Code sold.month sold.year sold.qtr List.year List.Date.Year.quarter
#1 Alex     V0N 1B4         NA        NA       NA      2018                2018 Q2
#2 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2
#3 <NA>     V0N 1B4         NA        NA       NA      2018                2018 Q2
#4 <NA>        <NA>         NA        NA       NA        NA                2019 Q4
#5 <NA>        <NA>          1      2020        1        NA                2019 Q4
#6 <NA>        <NA>          1      2020        1        NA                2019 Q4

推荐阅读