Join data frames in dplyr by characters

Problem description

I have two data frames:

DF1

X          Y    ID
banana     14   1
orange     20   2
pineapple  1    3
guava      300  4
grapes     1    5

DF2

Store      State   ID
Walmart    NY      1
Sears      AL      1;2
Target     DC      3
Old Navy   PA      3
Popeye's   HA      5
Footlocker NJ      4;5

I join them as follows and get:

df1 %>% 
  inner_join(df2, by = "ID")

X          Y    ID    Store      State
banana     14   1     Walmart    NY
pineapple  1    3     Target     DC
pineapple  1    3     Old Navy   PA
grapes     1    5     Popeye's   HA

But because of the semicolons, the join misses the rows whose ID holds more than one value. The final result should look like this:

X          Y    ID    Store       State
banana     14   1     Walmart     NY
banana     14   1     Sears       AL
orange     20   2     Sears       AL
pineapple  1    3     Target      DC
pineapple  1    3     Old Navy    PA
guava      300  4     Footlocker  NJ
grapes     1    5     Popeye's    HA
grapes     1    5     Footlocker  NJ

Tags: r, join, dplyr, tidyverse

Solution


separate_rows() from tidyr, used together with dplyr, will help here.
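
As a rough illustration of what separate_rows() does on its own, the sketch below applies it to a small made-up data frame. The name demo and its two rows are assumptions for illustration only and are not part of the original post.

```r
library(tidyr)

# Hypothetical two-row frame; `demo` is an illustrative name only.
demo <- data.frame(
  Store = c("Sears", "Target"),
  ID    = c("1;2", "3"),
  stringsAsFactors = FALSE
)

# Each semicolon-delimited ID gets its own row; Store is repeated as needed.
long <- separate_rows(demo, ID, sep = ";")
print(long)

# ID is still a character column at this point, so it needs converting
# before it can be matched against an integer key.
print(class(long$ID))
```

Passing sep = ";" explicitly is a choice made here for clarity; the default separator pattern of separate_rows() also splits on semicolons.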

I named the first table fruit and the other one stores.

library(dplyr)
library(tidyr)


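# Split the semicolon-delimited IDs in stores into one row per ID,
# convert ID to integer so it matches fruit$ID, then join.
fruit %>% 
  inner_join(separate_rows(stores, ID) %>% mutate(ID = as.integer(ID)))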

Joining, by = "ID"
          X   Y ID      Store State
1    banana  14  1    Walmart    NY
2    banana  14  1      Sears    AL
3    orange  20  2      Sears    AL
4 pineapple   1  3     Target    DC
5 pineapple   1  3   Old Navy    PA
6     guava 300  4 Footlocker    NJ
7    grapes   1  5   Popeye's    HA
8    grapes   1  5 Footlocker    NJ
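
For anyone who wants to reproduce the result, here is a self-contained sketch. It rebuilds the two tables from the question, assuming that fruit$ID is an integer column and stores$ID is a character column (the post does not state the column types). It also uses convert = TRUE in place of the explicit as.integer() step and names the join key with by = "ID"; both are small variations on the answer above, not the original author's exact code.

```r
library(dplyr)
library(tidyr)

# Tables from the question; column types are an assumption.
fruit <- data.frame(
  X  = c("banana", "orange", "pineapple", "guava", "grapes"),
  Y  = c(14L, 20L, 1L, 300L, 1L),
  ID = 1:5,
  stringsAsFactors = FALSE
)

stores <- data.frame(
  Store = c("Walmart", "Sears", "Target", "Old Navy", "Popeye's", "Footlocker"),
  State = c("NY", "AL", "DC", "PA", "HA", "NJ"),
  ID    = c("1", "1;2", "3", "3", "5", "4;5"),
  stringsAsFactors = FALSE
)

# One row per store/ID pair; convert = TRUE turns the split IDs into integers.
stores_long <- separate_rows(stores, ID, sep = ";", convert = TRUE)

# Join on the now-compatible integer key.
result <- inner_join(fruit, stores_long, by = "ID")
print(result)
```

Recent tidyr releases mark separate_rows() as superseded in favour of separate_longer_delim(); the older function still works, so which one to use mostly depends on the installed tidyr version.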
