How do I combine two columns of a data frame element by element?

Problem description

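I need to combine two columns of a data frame element by element. I tried the paste function, but it just joins the two whole strings in each row instead of pairing up the comma-separated items inside them, which is not what I need: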

#sample data
df <- data.frame ("col1" = c("red|",
                             "blue| , red|", 
                             "blue| , red| , yellow|"), 
                  "col2" = c("green",
                             "yellow , blue",
                             "black , red , blue"))

#this is what I tried:
df$new <- paste(df$col1, df$col2, sep = " , ")

#output for each row:
# "red| , green"           
# "blue| , red| , yellow , blue"            
# "blue| , red| , yellow| , black , red , blue"

#below is the desired output:
df$correct_output <- c("red|green",
                       "blue|yellow , red|blue",
                       "blue|black , red|red , yellow|blue")

Tags: r, dataframe

Solution


#sample data
df <- data.frame ("col1" = c("red|",
                             "blue| , red|", 
                             "blue| , red| , yellow|"), 
                  "col2" = c("green",
                             "yellow , blue",
                             "black , red , blue"))

library(tidyverse)

df %>%
  group_by(id = row_number()) %>%           # group by a row id (useful to reshape)
  separate_rows(col1, col2, sep=" ,") %>%   # separate based on comma and add new rows
  unite(col, col1, col2, sep="") %>%        # combine corresponding values
  summarise(correct = paste0(gsub(" ", "", col), collapse = ", ")) %>% # remove any spaces and combine values
  bind_cols(df, .) %>%                      # bind original dataset
  select(-id)                               # remove id column

#                     col1               col2                          correct
# 1                   red|              green                        red|green
# 2           blue| , red|      yellow , blue            blue|yellow, red|blue
# 3 blue| , red| , yellow| black , red , blue blue|black, red|red, yellow|blue
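
For comparison, the same pairing can probably be done in base R without reshaping the data. The sketch below is only an illustration: `combine_pairs` is a hypothetical helper name that does not appear in the original answer. It assumes both columns hold the same number of comma-separated items in every row (otherwise `paste0` silently recycles the shorter side), and it joins the pairs with " , " to match the desired output shown in the question.

```r
# Hypothetical helper (not part of the original answer).
# Assumes a[i] and b[i] contain the same number of comma-separated items.
combine_pairs <- function(a, b, sep = " , ") {
  # as.character() guards against factor columns (the default in R < 4.0)
  a_parts <- strsplit(as.character(a), "\\s*,\\s*")  # split on commas, dropping surrounding spaces
  b_parts <- strsplit(as.character(b), "\\s*,\\s*")
  # paste corresponding items together, then rejoin each row into one string
  mapply(function(x, y) paste(paste0(x, y), collapse = sep),
         a_parts, b_parts, USE.NAMES = FALSE)
}

# same sample data as in the question
df <- data.frame(col1 = c("red|",
                          "blue| , red|",
                          "blue| , red| , yellow|"),
                 col2 = c("green",
                          "yellow , blue",
                          "black , red , blue"),
                 stringsAsFactors = FALSE)

df$new <- combine_pairs(df$col1, df$col2)
```

Because the split and the paste both happen per row inside `mapply`, no row id or grouping step is needed; the trade-off is that mismatched item counts are not detected unless a length check is added.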
