Indexing values in a column

Problem description

I have a data frame:

dput(df)
structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A1", "B2",
"B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2"), operation = c("open",
"open", "close", "", "open", "close", "", "open", "open", "open",
"close", "upload", "open", "close", "open", "close")), class = "data.frame", row.names = c(NA,
-16L))
ID      operation
A1       open
A1       open
A1       close
A1       
A1       open
A1       close
B2      
B2       open
B2       open
B2       open
B2       close
B2       upload
B2       open
B2       close
B2       open
B2       close

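I want to add an index to each "open"/"close" bundle in the operation column. Every row from an "open" through its matching "close" must have the same index. The desired result is: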

ID      operation    index
A1       open         1
A1       open         1
A1       close        1
A1       
A1       open         2
A1       close        2
B2      
B2       open         3
B2       open         3
B2       open         3
B2       close        3
B2       upload
B2       open         4
B2       close        4
B2       open         5
B2       close        5

This is what I do:

dt[, index := .GRP, by = .(rev(cumsum(rev(operation) == 'close')))]
dt[, index := ifelse(cumsum(operation == 'open') > 0, index, NA), by = .(ID, index)]

But I want to allow two closing options: the closing operation can be either "close" or "checking":

ID      operation
A1       open
A1       open
A1       checking
A1       
A1       open
A1       close
B2      
B2       open
B2       open
B2       open
B2       close
B2       upload
B2       open
B2       close
B2       open
B2       close

I want to get:

ID      operation    index
A1       open         1
A1       open         1
A1       checking     1
A1       
A1       open         2
A1       close        2
B2      
B2       open         3
B2       open         3
B2       open         3
B2       close        3
B2       upload
B2       open         4
B2       close        4
B2       open         5
B2       close        5

How can I add this "or" option?

Tags: r, dataframe, indexing, data.table

Solution


You can use %in% to check against multiple values.

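library(data.table)
setDT(dt)  # convert dt to a data.table by reference
# Number the blocks: each block ends at a row whose operation is 'close' or 'checking'
dt[, index := .GRP, by = .(rev(cumsum(rev(operation) %in% c('close', 'checking'))))]
# Within each ID/block, keep the index only from the first 'open' onward; earlier rows become NA
dt[, index := ifelse(cumsum(operation == 'open') > 0, index, NA), by = .(ID, index)]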
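
To see why the reversed cumulative sum yields one group per open/close bundle, it may help to walk through the intermediate vectors. The following is only a base R sketch of the same logic, not the data.table code itself: match(key, unique(key)) stands in for .GRP, ave() stands in for the grouped cumsum, and the variable names (is_end, key, block, opened) are made up for illustration. The data is the second example from the question.

```r
# Data from the question's second example ("checking" also closes a block)
ID <- rep(c("A1", "B2"), times = c(6, 10))
operation <- c("open", "open", "checking", "", "open", "close",
               "", "open", "open", "open", "close", "upload",
               "open", "close", "open", "close")

# TRUE on every row that ends a block; %in% tests membership in a set of values
is_end <- operation %in% c("close", "checking")

# Number of closing rows at or after each row: constant within a block,
# and it drops by one on the row right after each closing row
key <- rev(cumsum(rev(is_end)))

# Number the blocks in order of first appearance (the role .GRP plays above)
block <- match(key, unique(key))

# Within each ID/block, has an "open" been seen yet?
opened <- ave(as.integer(operation == "open"), ID, block, FUN = cumsum) > 0

# Rows before the first "open" of their block get NA
index <- ifelse(opened, block, NA_integer_)

print(data.frame(ID, operation, key, block, index))
```

Reversing before the cumulative sum is what puts the closing row in the same group as the rows before it. A plain forward cumsum would instead start a new group on the closing row itself.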
