Extract only the rows with a unique combination of column values in R

Problem description

I have a flight database that gives route details. It looks like this:

Ori.  Dest  Carr. Pass Flights
JFK   LAX   Delta 15004 50
JFK   LAX   JetBl 17434 100
JFK   BOS   Delta 15344 89
ATL   FLR   AmerA 25054 90
OHD   LAX   Delta 19876 95
OHD   LAX   AmerA 12344 45

For the output, I only need the routes that are served by a single carrier. The output should look like this:

JFK   BOS   Delta 15344 89
ATL   FLR   AmerA 25054 90

How can I do this in R?

Tags: r

Solution


You can use dplyr to group by route and keep the groups that contain exactly one row:

library(dplyr)
df %>% group_by(Ori., Dest) %>% filter(n() == 1)

# Ori.  Dest  Carr.  Pass Flights
#  <chr> <chr> <chr> <int>   <int>
#1 JFK   BOS   Delta 15344      89
#2 ATL   FLR   AmerA 25054      90

Using data.table:

library(data.table)
setDT(df)[, .SD[.N == 1], .(Ori., Dest)]

And in base R:

subset(df, ave(Flights, Ori., Dest, FUN = length) == 1)

Data

df <- structure(list(Ori. = c("JFK", "JFK", "JFK", "ATL", "OHD", "OHD"
), Dest = c("LAX", "LAX", "BOS", "FLR", "LAX", "LAX"), Carr. = c("Delta", 
"JetBl", "Delta", "AmerA", "Delta", "AmerA"), Pass = c(15004L, 
17434L, 15344L, 25054L, 19876L, 12344L), Flights = c(50L, 100L, 
89L, 90L, 95L, 45L)), class = "data.frame", row.names = c(NA, -6L))
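
Note that the solutions above count rows per route, which matches "only one carrier" as long as each carrier appears at most once per route, as in the sample data. If a route could have several rows for the same carrier, counting distinct carriers may be closer to what the question asks. The following is a minimal base R sketch of that variant. The seventh row (a second Delta entry on JFK to BOS) is hypothetical and was added only to show the difference, and the names carrier_codes, n_carriers and single_carrier are illustrative.

```r
# Sample data from the "Data" section, plus one HYPOTHETICAL extra row
# (a second Delta entry on JFK -> BOS) that is not in the original question.
df <- data.frame(
  Ori.    = c("JFK", "JFK", "JFK", "ATL", "OHD", "OHD", "JFK"),
  Dest    = c("LAX", "LAX", "BOS", "FLR", "LAX", "LAX", "BOS"),
  Carr.   = c("Delta", "JetBl", "Delta", "AmerA", "Delta", "AmerA", "Delta"),
  Pass    = c(15004L, 17434L, 15344L, 25054L, 19876L, 12344L, 1000L),
  Flights = c(50L, 100L, 89L, 90L, 95L, 45L, 5L),
  stringsAsFactors = FALSE
)

# ave() returns a vector of the same type as its first argument, so the
# carrier names are converted to integer codes before counting.
carrier_codes <- as.integer(factor(df$Carr.))

# Number of distinct carriers on each (Ori., Dest) route, repeated per row.
n_carriers <- ave(carrier_codes, df$Ori., df$Dest,
                  FUN = function(x) length(unique(x)))

# Keep every row that belongs to a route served by exactly one carrier.
single_carrier <- df[n_carriers == 1, ]
print(single_carrier)
```

With this data, the row-count approach would drop JFK to BOS because it now has two rows, while the distinct-carrier count keeps both of its rows since Delta is the only carrier on that route. With dplyr loaded, the same idea could be written as filter(n_distinct(Carr.) == 1) after the group_by() call.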
