How do I separate two values of the same variable that share a name?

Problem description

I have two data frames like this:

library(tidyverse)
a <- tibble(x=c("mother","father","brother","brother"),y=c("a","b","c","d"))
b <- tibble(x=c("mother","father","brother","brother"),z=c("e","f","g","h"))

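I want to join these data frames so that each "brother" appears only once.

I tried a full join:

ab <- full_join(a, b, by = "x")

and got this, where each "brother" row in a is matched with both "brother" rows in b:

# A tibble: 6 x 3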
  x       y     z    
  <chr>   <chr> <chr>
1 mother  a     e    
2 father  b     f    
3 brother c     g    
4 brother c     h    
5 brother d     g    
6 brother d     h 

What I need is this:

ab <- tibble(x=c("mother","father","brother1","brother2"),y=c("a","b","c","d"),z=c("e","f","g","h"))

# A tibble: 4 x 3
  x        y     z    
  <chr>    <chr> <chr>
1 mother   a     e    
2 father   b     f    
3 brother1 c     g    
4 brother2 d     h

Tags: r, join, tibble

Solution

With dplyr you can do the following. It adds an extra variable, person, that numbers each row within each group of x, and then joins by both x and person:

library(dplyr)

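# Number the rows within each x group so duplicate names can be told apart
a %>% 
    group_by(x) %>% 
    mutate(person = row_number()) %>%
    # Do the same for b, then match on both the name and the within-group index
    full_join(b %>% 
                  group_by(x) %>%
                  mutate(person = row_number()),
              by = c("x", "person")
              ) %>% 
    select(x, person, y, z)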

This returns:

# A tibble: 4 x 4
# Groups:   x [3]
  x       person y     z    
  <chr>    <int> <chr> <chr>
1 mother       1 a     e    
2 father       1 b     f    
3 brother      1 c     g    
4 brother      2 d     h  
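
The result above keeps x as "brother" and stores the index in a separate person column. If you want the exact labels from the question (brother1, brother2), one possible sketch is to paste the index onto any name that occurs more than once after the join. The helper add_person and the column n_same are names made up here for illustration; they are not part of dplyr.

```r
library(dplyr)

a <- tibble(x = c("mother", "father", "brother", "brother"), y = c("a", "b", "c", "d"))
b <- tibble(x = c("mother", "father", "brother", "brother"), z = c("e", "f", "g", "h"))

# Hypothetical helper: number the rows within each x group, then drop the grouping
add_person <- function(df) {
  df %>%
    group_by(x) %>%
    mutate(person = row_number()) %>%
    ungroup()
}

ab <- full_join(add_person(a), add_person(b), by = c("x", "person")) %>%
  # Count how many rows share each name after the join
  add_count(x, name = "n_same") %>%
  # Append the index only to names that occur more than once
  mutate(x = if_else(n_same > 1, paste0(x, person), x)) %>%
  select(x, y, z)

ab
```

This assumes the duplicated names appear in the same order in both data frames, since rows are paired by their position within each group.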
