Summarizing columns of an R data frame regardless of order: (df$A, df$B) = (df$B, df$A)

Problem description

I have the following data frame:

    
      destiny origin Count
    1 KJFK    SBBR       4
    2 KJFK    SAEZ    4683
    3 SBGL    KJFK       2
    4 SBBR    KJFK       2
    5 KJFK    SBGL    4987
    6 KJFK    SBGR   12911
    ...
    

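Because I am interested in the route itself, KJFK -> SBBR is the same as SBBR -> KJFK for my purposes. So I would like to sum their counts, as in the following table: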

    
      destiny origin Count
    1 KJFK    SBBR       6
    2 KJFK    SAEZ    4683
    3 SBGL    KJFK    4989
    4 KJFK    SBGR   12911
    ...
    

I would rather not use a big for loop to evaluate all the values.

Tags: r, dataframe, merge, summarization

Solution


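Here is one option using pmin/pmax. These functions take the element-wise minimum and maximum of the two columns, so both directions of a route map to the same (destinyN, originN) key, which can then be used for grouping: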

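library(tidyverse)

df1 %>%
  # Build an order-independent key: the smaller and the larger of the two codes
  group_by(destinyN = pmin(destiny, origin), originN = pmax(destiny, origin)) %>%
  # Keep the direction of the first row in each group and add up the counts
  summarise(destiny = first(destiny),
            origin = first(origin),
            Count = sum(Count)) %>%
  ungroup() %>%
  # Drop the helper key columns
  select(-destinyN, -originN)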
# A tibble: 4 x 3
#  destiny origin Count
#  <chr>   <chr>  <int>
#1 KJFK    SAEZ    4683
#2 KJFK    SBBR       6
#3 SBGL    KJFK    4989
#4 KJFK    SBGR   12911
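
For comparison, the same pmin/pmax idea can be sketched in base R without dplyr. This is only an illustration, not part of the original answer: the names `routes`, `a`, `b`, and `totals` are made up for this example, and `routes` is a small copy of the sample data. Unlike the dplyr version, it reports each route by its ordered key rather than by the direction of the first matching row.

```r
# Hypothetical copy of the sample data from the question
routes <- data.frame(
  destiny = c("KJFK", "KJFK", "SBGL", "SBBR", "KJFK", "KJFK"),
  origin  = c("SBBR", "SAEZ", "KJFK", "KJFK", "SBGL", "SBGR"),
  Count   = c(4L, 4683L, 2L, 2L, 4987L, 12911L),
  stringsAsFactors = FALSE
)

# Order-independent key: element-wise smaller and larger airport code,
# so KJFK -> SBBR and SBBR -> KJFK both become (KJFK, SBBR)
routes$a <- pmin(routes$destiny, routes$origin)
routes$b <- pmax(routes$destiny, routes$origin)

# Sum the counts for each unordered pair
totals <- aggregate(Count ~ a + b, data = routes, FUN = sum)
print(totals)
```

The grouping key is the whole technique; any aggregation tool that can group by the two helper columns should give the same totals.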

Data

df1 <- structure(list(destiny = c("KJFK", "KJFK", "SBGL", "SBBR", "KJFK", 
"KJFK"), origin = c("SBBR", "SAEZ", "KJFK", "KJFK", "SBGL", "SBGR"
), Count = c(4L, 4683L, 2L, 2L, 4987L, 12911L)), .Names = c("destiny", 
"origin", "Count"), row.names = c("1", "2", "3", "4", "5", "6"
), class = "data.frame")
