Add a column from df2 to df1 based on matches between df1 and df2

Problem description

I have two datasets, df1 and df2, which share the columns "ID" and "State":

df1 <- data.frame(ID=c(1:20), State=c("NA","NA","NA","NA","NA","NA","NA","NA","NA","NA","CA","IL","SD","NC","SC","WA","CO","AL","AK","HI"))
df2 <- data.frame(ID=c(1,2,3,4,5,"NA","NA","NA","NA","NA"), Year=c("2020","2021","2020","2020","2021","2020","2020","2021","2020","2019"),State=c("NA","NA","NA","NA","NA","CA","SC","NY","NJ","OR"))

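How can I add Year from df2 to df1, for rows where df1 has the same ID or the same State as a row in df2?

The reason for this change: I only need to bring the "Year" information from df2 into df1.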

Tags: r

Solution


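You can do it like this with dplyr. First convert the "NA" strings into real missing values with type.convert(), then join on ID, join the rows of df2 that have no ID on State, and combine the two resulting Year columns with coalesce():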

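library(dplyr)

# Turn the "NA" strings into real NA values and the numeric strings into integers
df1 <- type.convert(df1, as.is = TRUE)
df2 <- type.convert(df2, as.is = TRUE)

df1 %>%
    left_join(select(df2, -State), 'ID') %>%                     # Year matched by ID
    left_join(select(filter(df2, is.na(ID)), -ID), 'State') %>%  # Year matched by State
    mutate(Year = coalesce(Year.x, Year.y), Year.x = NULL, Year.y = NULL)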

   ID State Year
1   1  <NA> 2020
2   2  <NA> 2021
3   3  <NA> 2020
4   4  <NA> 2020
5   5  <NA> 2021
6   6  <NA>   NA
7   7  <NA>   NA
8   8  <NA>   NA
9   9  <NA>   NA
10 10  <NA>   NA
11 11    CA 2020
12 12    IL   NA
13 13    SD   NA
14 14    NC   NA
15 15    SC 2020
16 16    WA   NA
17 17    CO   NA
18 18    AL   NA
19 19    AK   NA
20 20    HI   NA
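
If you would rather not depend on dplyr, the same two-step lookup can be sketched in base R with match(). This is only an illustrative alternative, not part of the original answer: the helper names by_id, by_state and state_rows are made up for this example, and it assumes, as in the sample data, that each ID and each State appears at most once in the df2 rows used for the lookup.

```r
# Sample data from the question, with the "NA" strings converted to real NA values
df1 <- type.convert(
  data.frame(ID = 1:20,
             State = c(rep("NA", 10), "CA", "IL", "SD", "NC", "SC",
                       "WA", "CO", "AL", "AK", "HI")),
  as.is = TRUE)
df2 <- type.convert(
  data.frame(ID = c(1, 2, 3, 4, 5, "NA", "NA", "NA", "NA", "NA"),
             Year = c("2020", "2021", "2020", "2020", "2021",
                      "2020", "2020", "2021", "2020", "2019"),
             State = c("NA", "NA", "NA", "NA", "NA",
                       "CA", "SC", "NY", "NJ", "OR")),
  as.is = TRUE)

# Step 1: look up Year by ID; incomparables = NA keeps a missing ID from matching
by_id <- df2$Year[match(df1$ID, df2$ID, incomparables = NA)]

# Step 2: look up Year by State, using only the df2 rows that have no ID
state_rows <- df2[is.na(df2$ID), ]
by_state <- state_rows$Year[match(df1$State, state_rows$State, incomparables = NA)]

# Prefer the ID match and fall back to the State match
df1$Year <- ifelse(is.na(by_id), by_state, by_id)
df1
```

Because match() returns only the first hit, duplicated keys in df2 would silently be resolved to the first matching row, whereas left_join() would duplicate rows of df1 instead.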
