Concatenate the values of one column per group

Question

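I want to concatenate, per group, the values that currently sit in a single column. Below is a short version of the data frame I want to tidy.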

library(tidyverse)

df <- tibble::tribble(
  ~county,  ~party,
      "A",   "VVD",
      "A",    "GL",
      "A", "Local",
      "B",   "D66",
      "B", "Local"
  )

Now I want one row per county, with each party in its own column:

df2 <- tibble::tribble(
  ~county, ~party1, ~party2, ~party3,
      "A",   "VVD",    "GL", "Local",
      "B",   "D66", "Local",      NA
  )

After that I combine the columns with unite(), replace the underscores with commas, and remove the NAs.

df2 %>%
  unite(party, c("party1", "party2", "party3")) %>%
  mutate(party = gsub("_NA", "", party),
         party = gsub("_", ", ", party))
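As a side note, the two gsub() calls may not be needed. A minimal sketch, assuming tidyr 1.0.0 or later (where unite() has an na.rm argument); the name res is just for illustration:

```r
library(tidyr)

# Wide data from the question, rebuilt so this block runs on its own
df2 <- tibble::tribble(
  ~county, ~party1, ~party2, ~party3,
      "A",   "VVD",    "GL", "Local",
      "B",   "D66", "Local",      NA
)

# sep sets the separator directly; na.rm = TRUE drops missing values
# before the remaining ones are pasted together
res <- unite(df2, party, party1, party2, party3, sep = ", ", na.rm = TRUE)
res
```

This avoids the pattern matching on "_NA", which would also alter a real party name that happens to contain that text.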

The output I want:

  county party         
  <chr>  <chr>         
1 A      VVD, GL, Local
2 B      D66, Local

Tags: r, tidyr

Answer


We can get the wide format by creating a sequence column within each group and then calling spread:

library(tidyverse)
df %>%
   group_by(county) %>% 
   mutate(v1 = paste0('party', row_number())) %>% 
   spread(v1, party)
# A tibble: 2 x 4
# Groups:   county [2]
#  county party1 party2 party3
#  <chr>  <chr>  <chr>  <chr> 
#1 A      VVD    GL     Local 
#2 B      D66    Local  <NA>  
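In current tidyr releases spread() is superseded by pivot_wider(). The same reshaping could be sketched as follows, assuming tidyr 1.0.0 or later; the ungroup() call and the name wide are additions for illustration, not part of the answer above:

```r
library(dplyr)
library(tidyr)

# Long data from the question, rebuilt so this block runs on its own
df <- tibble::tribble(
  ~county,  ~party,
      "A",   "VVD",
      "A",    "GL",
      "A", "Local",
      "B",   "D66",
      "B", "Local"
)

wide <- df %>%
  group_by(county) %>%
  # number the parties within each county: party1, party2, ...
  mutate(v1 = paste0("party", row_number())) %>%
  ungroup() %>%
  # one column per sequence label; counties with fewer parties get NA
  pivot_wider(names_from = v1, values_from = party)
wide
```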

For the second output, we group by county and paste the elements of party into a single string with toString:

df %>%
  group_by(county) %>%
  summarise(party = toString(party))
# A tibble: 2 x 2
#  county party         
#  <chr>  <chr>         
#1 A      VVD, GL, Local
#2 B      D66, Local   
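toString() is a shorthand for paste() with collapse = ", ", so a different separator can be had by calling paste() directly. A small base R sketch; the vector and the "; " separator are only illustrative:

```r
party <- c("VVD", "GL", "Local")

# toString() joins the elements with ", "
a <- toString(party)

# paste() with collapse does the same and lets you choose the separator
b <- paste(party, collapse = ", ")
c2 <- paste(party, collapse = "; ")

identical(a, b)
```

Inside the pipeline above, that would mean swapping toString(party) for paste(party, collapse = "; ") in summarise().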
