How to join 3 tables with incomplete values in R

Question

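I want to join df1 and df2 to obtain df_me. Since I could not get that result directly, I also tried using df_p as the hub of a star schema, but that did not give me the result I want either.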

library(tidyverse)

df1 <- tibble(c = c('b','c','d'),
              x = c(1, 2, 3),
              z = c(10, 11, 12))

df2 <- tibble(c = c('a','b','d'),
              y = c(4,5,6),
              z = c(20, 10, 12))

df_p <- tibble(c = c('a','b','c','d'),
               z = c(20, 10, 11, 12))

# This is the result that I want

df_me <- tibble(c = c('a','b','c','d'),
                x = c(NA, 1, 2, 3),
                y = c(4, 5, NA, 6),
                z = c(20, 10, 11, 12))

# This is (part of) what I tried without success

df_left2 <- left_join(df_p, df1, by = 'c')
df_left3 <- left_join(df_p, df2, by = 'c')
df_left4 <- left_join(df_left2, df_left3, by = 'c')

df_left4 %>% arrange(c)
#> # A tibble: 4 x 7
#>   c     z.x.x     x z.y.x z.x.y     y z.y.y
#>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 a        20    NA    NA    20     4    20
#> 2 b        10     1    10    10     5    10
#> 3 c        11     2    11    11    NA    NA
#> 4 d        12     3    12    12     6    12
Created on 2021-06-11 by the reprex package (v2.0.0)

Tags: r, join

Answer


Why not merge on both shared columns, c and z?

merge(df1, df2, by = c('c', 'z'), all = TRUE)

  c  z  x  y
1 a 20 NA  4
2 b 10  1  5
3 c 11  2 NA
4 d 12  3  6

Or, in dplyr:

df1 %>% full_join(df2, by = c('c', 'z'))
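
The attempt in the question joined only on 'c', so the z column present in every table was kept once per input and given .x/.y suffixes. Joining on both 'c' and 'z' avoids that. The following is a minimal sketch, not the answerer's own code, that rebuilds the question's data and compares two variants against df_me: the full join shown above, and a star-schema variant starting from df_p. The names res_full and res_star are hypothetical, and the final select() is an added step used only to match the column order of df_me.

```r
library(dplyr)
library(tibble)

# Data copied from the question
df1  <- tibble(c = c('b', 'c', 'd'), x = c(1, 2, 3), z = c(10, 11, 12))
df2  <- tibble(c = c('a', 'b', 'd'), y = c(4, 5, 6), z = c(20, 10, 12))
df_p <- tibble(c = c('a', 'b', 'c', 'd'), z = c(20, 10, 11, 12))

# Desired result from the question
df_me <- tibble(c = c('a', 'b', 'c', 'd'),
                x = c(NA, 1, 2, 3),
                y = c(4, 5, NA, 6),
                z = c(20, 10, 11, 12))

# Variant 1: full join on both shared columns, then sort by c
# and reorder the columns to match df_me (res_full is a hypothetical name)
res_full <- df1 %>%
  full_join(df2, by = c('c', 'z')) %>%
  arrange(c) %>%
  select("c", "x", "y", "z")

# Variant 2: star-schema approach, using df_p as the hub and
# joining each table on both keys (res_star is a hypothetical name)
res_star <- df_p %>%
  left_join(df1, by = c('c', 'z')) %>%
  left_join(df2, by = c('c', 'z')) %>%
  select("c", "x", "y", "z")

# Compare both variants with the desired result
all.equal(res_full, df_me)
all.equal(res_star, df_me)
```

The star-schema variant only makes sense if df_p really lists every (c, z) pair; with the data in the question it does, so both variants should agree.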
