Extracting similar values from bottom to top in R

Question

I have a table like the following:

library(data.table)
FDate <- data.table(Date = 1:6, Cycle = c(90, 100, 130, 150, 170, 200), i.Task = c(NA, NA, "D", NA, NA, "A"), Task = c("D", "A", "C", "B", NA, NA))
   Date Cycle i.Task Task
1:    1    90   <NA>    D
2:    2   100   <NA>    A
3:    3   130      D    C
4:    4   150   <NA>    B
5:    5   170   <NA> <NA>
6:    6   200      A <NA>

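How can I extract each task (from either the Task or the i.Task column) together with its largest corresponding Cycle? The output should look like this: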

  Cycle Task
1   130    C
2   130    D
3   150    B
4   200    A

Tags: r, duplicates

Answer


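We can melt the data into 'long' format so that both task columns are stacked into one, then group by 'Task' and take the max 'Cycle' value in each group: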

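library(data.table)
# Stack i.Task and Task into a single "Task" column, dropping NA rows,
# then keep the largest Cycle for each task
melt(FDate, id.vars = c("Date", "Cycle"), na.rm = TRUE, value.name = "Task")[,
     .(Cycle = max(Cycle)), by = Task]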
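
The grouped result above comes back in order of first appearance of each task, with Task as the first column. If the exact layout from the question is wanted (Cycle first, rows sorted by Cycle), one possible sketch is to chain an ordering step. The names `long` and `res` are only illustrative, and sorting ties by Task is an assumption about the intended order:

```r
library(data.table)

# Sample data from the question
FDate <- data.table(
  Date   = 1:6,
  Cycle  = c(90, 100, 130, 150, 170, 200),
  i.Task = c(NA, NA, "D", NA, NA, "A"),
  Task   = c("D", "A", "C", "B", NA, NA)
)

# Stack both task columns into one long "Task" column, dropping NAs
long <- melt(FDate,
             id.vars      = c("Date", "Cycle"),
             measure.vars = c("i.Task", "Task"),
             value.name   = "Task",
             na.rm        = TRUE)

# Largest Cycle per task, then sort by Cycle (ties by Task) and
# put Cycle first to mirror the layout requested in the question
res <- long[, .(Cycle = max(Cycle)), by = Task][order(Cycle, Task), .(Cycle, Task)]
print(res)
```

Here `measure.vars` is spelled out explicitly so the reshape does not depend on which other columns happen to be in the table.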

Or a similar option using gather from the tidyverse:

library(tidyverse)
gather(FDate, key, Task, matches("Task"), na.rm = TRUE) %>% 
    group_by(Task) %>%
    summarise(Cycle = max(Cycle)) %>%
    select(Cycle, Task) %>%
    arrange(Cycle)
# A tibble: 4 x 2
#  Cycle Task 
#  <dbl> <chr>
#1   130 C    
#2   130 D    
#3   150 B    
#4   200 A    
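
In current tidyr releases `gather()` is marked as superseded in favour of `pivot_longer()`. A rough equivalent of the pipeline above, written with `pivot_longer()`, might look like the sketch below. It is not part of the original answer; the column name `key` is carried over from the `gather()` call, and `res` is just an illustrative name:

```r
library(data.table)
library(dplyr)
library(tidyr)

# Sample data from the question
FDate <- data.table(
  Date   = 1:6,
  Cycle  = c(90, 100, 130, 150, 170, 200),
  i.Task = c(NA, NA, "D", NA, NA, "A"),
  Task   = c("D", "A", "C", "B", NA, NA)
)

res <- FDate %>%
  as_tibble() %>%
  # Stack i.Task and Task into one "Task" column, dropping NA entries
  pivot_longer(c(i.Task, Task),
               names_to       = "key",
               values_to      = "Task",
               values_drop_na = TRUE) %>%
  group_by(Task) %>%
  summarise(Cycle = max(Cycle)) %>%   # largest Cycle per task
  select(Cycle, Task) %>%
  arrange(Cycle)

print(res)
```

The converted tibble is used here only to keep the example independent of how data.table objects are dispatched by tidyr.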
