How to assign words to numbers in a data frame

Question

I have the data frame below, in which two columns contain numbers that I need to replace with strings taken from a separate reference dataset.

Dataset 1:

lhs         rhs 
32,39,6     65  
39,6,65     32  
14,16,26    15
16,20,4     26  
16,26,33    4   
53          31  

Dataset 2:

id   name
4   yougurt
6   coffee
14  cream chese
15  meat spreads
16  butter
20  whole milk
26  condensed milk
31  curd
32  flour
39  rolls
53  sugar
65  soda

Expected output:

lhs                                     rhs
flour, rolls, coffee                   soda
rolls, coffee, soda                    flour
cream chese, butter, condensed milk    meat spreads

Tags: r, dataframe

Solution


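A solution using dplyr and tidyr. dat is the final output. The key is to use separate_rows to expand lhs into one row per id, then apply left_join twice: once to look up the names for lhs and once for rhs.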

library(dplyr)
library(tidyr)

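dat <- dat1 %>%
  # One row per id in lhs; convert = TRUE turns the ids into integers
  separate_rows(lhs, convert = TRUE) %>%
  # Look up names for the lhs ids (name.x) and the rhs ids (name.y)
  left_join(dat2, by = c("lhs" = "id")) %>%
  left_join(dat2, by = c("rhs" = "id")) %>%
  # Drop lhs ids that have no match in dat2 (33 in this data)
  drop_na(name.x) %>%
  # Collapse the names back into one string per rhs
  group_by(name.y) %>%
  summarise(lhs = paste0(name.x, collapse = ", ")) %>%
  ungroup() %>%
  select(lhs, rhs = name.y)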

dat
# # A tibble: 6 x 2
#   lhs                                 rhs           
#   <chr>                               <chr>         
# 1 butter, whole milk, yougurt         condensed milk
# 2 sugar                               curd          
# 3 rolls, coffee, soda                 flour         
# 4 cream chese, butter, condensed milk meat spreads  
# 5 flour, rolls, coffee                soda          
# 6 butter, condensed milk              yougurt
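
Note that the result above is ordered by rhs name rather than by the original row order, and because the pipeline groups by name.y, any rows that shared the same rhs would be merged into one. If either matters, a base R lookup is one possible alternative. The following is only a sketch, not part of the original answer: the names lookup, ids_to_names and res are hypothetical, and the data is retyped from the question.

```r
# Input data, retyped from the question
dat1 <- data.frame(
  lhs = c("32,39,6", "39,6,65", "14,16,26", "16,20,4", "16,26,33", "53"),
  rhs = c(65, 32, 15, 26, 4, 31),
  stringsAsFactors = FALSE
)
dat2 <- data.frame(
  id = c(4, 6, 14, 15, 16, 20, 26, 31, 32, 39, 53, 65),
  name = c("yougurt", "coffee", "cream chese", "meat spreads", "butter",
           "whole milk", "condensed milk", "curd", "flour", "rolls",
           "sugar", "soda"),
  stringsAsFactors = FALSE
)

# Named character vector: names are the ids (as strings), values are the item names
lookup <- setNames(dat2$name, dat2$id)

# Hypothetical helper: translate one comma-separated id string into item names,
# silently skipping ids that are not present in dat2
ids_to_names <- function(ids) {
  found <- lookup[strsplit(ids, ",", fixed = TRUE)[[1]]]
  paste(found[!is.na(found)], collapse = ", ")
}

# Build the result row by row, so the original row order is preserved
res <- data.frame(
  lhs = vapply(dat1$lhs, ids_to_names, character(1), USE.NAMES = FALSE),
  rhs = unname(lookup[as.character(dat1$rhs)]),
  stringsAsFactors = FALSE
)
res
```

Because each row is translated independently, this version never merges rows; the trade-off is that it does the string handling by hand instead of relying on the join.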

Data

dat1 <- read.table(text = "lhs         rhs 
'32,39,6'     65  
'39,6,65'     32  
'14,16,26'    15
'16,20,4'     26  
'16,26,33'    4   
53          31  ",
                   stringsAsFactors = FALSE, header = TRUE)

dat2 <- read.table(text = "id   name
4   yougurt
6   coffee
14  'cream chese'
15  'meat spreads'
16  butter
20  'whole milk'
26  'condensed milk'
31  curd
32  flour
39  rolls
53  sugar
65  soda",
                   header = TRUE, stringsAsFactors = FALSE)
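
The solution calls drop_na(name.x) because the lhs value "16,26,33" contains an id, 33, that has no entry in the reference data. A quick check along the following lines can list such unmatched ids before joining. This is a minimal sketch: the vectors lhs, rhs and known_ids are retyped from the data above, and the names used_ids and missing_ids are hypothetical.

```r
# Ids as they appear in dat1 and dat2, retyped from the data above
lhs <- c("32,39,6", "39,6,65", "14,16,26", "16,20,4", "16,26,33", "53")
rhs <- c(65, 32, 15, 26, 4, 31)
known_ids <- c(4, 6, 14, 15, 16, 20, 26, 31, 32, 39, 53, 65)

# Every distinct id used in either column
used_ids <- unique(c(as.integer(unlist(strsplit(lhs, ","))), rhs))

# Ids that have no name in the reference data
missing_ids <- setdiff(used_ids, known_ids)
missing_ids
```

For this data the only unmatched id is 33. Without the drop_na step, the corresponding NA would be pasted into the collapsed string as the text "NA".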
