Subset rows with multiple conditions

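Question

I want to create a data frame that keeps, for each marker, the rows with the maximum value in the Height column, grouped by the sample identifying information (Sample_Type and Concentration). I've pasted an example data frame below. The final df in this example should contain rows 2-4.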

df1 <- structure(list(
  Marker = c("A", "A", "B", "B", "B", "B", "C", "A", "A", "A"),
  Height = c(40L, 61L, 38L, 33L, 49L, 114L, 152L, 108L, 108L, 50L),
  Sample_Type = c("NTC", "NTC", "NTC", "NTC", "NTC", "NTC", "NTC",
                  "CEPH", "CEPH", "CEPH"),
  Concentration = c(100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 50L, 50L),
  PCR_Protocol = c("Current_PCR", "Current_PCR", "Current_PCR", "Current_PCR",
                   "Current_PCR", "Current_PCR", "Current_PCR", "Current_PCR",
                   "Current_PCR", "Current_PCR")),
  class = "data.frame", row.names = c(NA, -10L))

Thanks!

Tags: r, subset

Answer

Using dplyr, group by Marker and keep the rows whose Height equals the group maximum:

library(dplyr)

df1 %>% 
  group_by(Marker) %>% 
  filter(max(Height) == Height)
# # A tibble: 3 x 6
# # Groups:   Marker [3]
#   Marker  Size Height Sample_Type Concentration PCR_Protocol
#   <chr>  <dbl>  <int> <chr>               <int> <chr>       
# 1 A       79.2     61 NTC                   100 Current_PCR 
# 2 B       84.2     38 NTC                   100 Current_PCR 
# 3 C       99.7     33 NTC                   100 Current_PCR 
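
The snippet above groups by Marker only, while the question also mentions Sample_Type and Concentration. A minimal sketch of that variant follows, assuming the data frame posted in the question is available as df1 (it is rebuilt here so the block runs on its own) and that all three columns should define a group. The name result is only illustrative.

```r
library(dplyr)

# Same values as the data frame posted in the question, rebuilt compactly.
df1 <- data.frame(
  Marker = c("A", "A", "B", "B", "B", "B", "C", "A", "A", "A"),
  Height = c(40L, 61L, 38L, 33L, 49L, 114L, 152L, 108L, 108L, 50L),
  Sample_Type = c(rep("NTC", 7), rep("CEPH", 3)),
  Concentration = c(rep(100L, 8), 50L, 50L),
  PCR_Protocol = "Current_PCR"
)

# Assumption: a group is one Marker / Sample_Type / Concentration combination.
result <- df1 %>%
  group_by(Marker, Sample_Type, Concentration) %>%
  filter(Height == max(Height)) %>%  # keep the row(s) at the group maximum
  ungroup()                          # drop grouping so later steps are not affected

print(result)
```

Note that filter() keeps every row tied for the maximum within a group. If exactly one row per group is wanted, slice_max(Height, n = 1, with_ties = FALSE) can be used in place of the filter() step.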
