How to change subsequent row values when a condition is met, across multiple groups?

Problem description

I have a data frame that looks like this:

ID  value   condition
A   0         0
A   3         0
A   0         1
A   7         1
A   5         0
A   5         0
A   5         0
A   7         0
B   6         0
B   2         1
B   7         0
B   10        1
B   0         0
B   6         0

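I want to change the ID name whenever the condition is met, and also change the ID name for the rows that follow. Each ID can meet the condition several times, so I want the ID to be modified every time that happens.

The result can either change the original ID or just add a new column: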

ID  value   condition   newID
A   0              0    A
A   3              0    A
A   0              1    A1
A   7              1    A1
A   5              0    A2
A   5              0    A2
A   5              0    A2
A   7              0    A2
B   6              0    B
B   2              1    B1
B   7              0    B2
B   10             1    B3
B   0              0    B4
B   6              0    B4

Tags: r, dataframe

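Solution

One option: after grouping by 'ID', create a run index with rleid (from data.table), then use case_when to paste that index onto 'ID' depending on its value. The first run in each group keeps the plain ID, and every later run gets its run number appended.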

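library(dplyr)
library(data.table)  # for rleid()
df1 %>% 
   group_by(ID) %>%
   # rleid() numbers each run of identical `condition` values; subtract 1 so the first run is 0
   mutate(newID = rleid(condition) - 1,
          # the first run keeps the plain ID; later runs get the run number appended
          newID = case_when(newID == 0 ~ first(ID), TRUE ~ paste0(first(ID), newID)))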
# A tibble: 14 x 4
# Groups:   ID [2]
#   ID    value condition newID
#   <chr> <int>     <int> <chr>
# 1 A         0         0 A    
# 2 A         3         0 A    
# 3 A         0         1 A1   
# 4 A         7         1 A1   
# 5 A         5         0 A2   
# 6 A         5         0 A2   
# 7 A         5         0 A2   
# 8 A         7         0 A2   
# 9 B         6         0 B    
#10 B         2         1 B1   
#11 B         7         0 B2   
#12 B        10         1 B3   
#13 B         0         0 B4   
#14 B         6         0 B4   
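
If loading data.table only for rleid is not desirable, the same run index can probably be built in base R. The following is a minimal sketch rather than part of the original answer: it assumes the rows are already ordered within each ID, rebuilds the example data inline so that it runs on its own, and uses a helper variable named run_index that is purely illustrative. Like the rleid version, it gives the first run of each ID the plain ID even when that run has condition equal to 1.

```r
# Example data, same values as df1 in the "Data" section
df1 <- data.frame(
  ID = c(rep("A", 8), rep("B", 6)),
  value = c(0L, 3L, 0L, 7L, 5L, 5L, 5L, 7L, 6L, 2L, 7L, 10L, 0L, 6L),
  condition = c(0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Within each ID, count how many times `condition` has changed so far.
# diff(x) != 0 marks the rows where a new run starts; the leading 0L
# makes the first run of every group index 0.
run_index <- ave(df1$condition, df1$ID,
                 FUN = function(x) cumsum(c(0L, diff(x) != 0L)))

# Run 0 keeps the plain ID; later runs get the run number appended.
df1$newID <- ifelse(run_index == 0, df1$ID, paste0(df1$ID, run_index))

print(df1)
```

On the example data this should give the same newID column as the expected result in the question (A, A, A1, A1, A2, ... B4, B4).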

Data

df1 <- structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A", 
 "B", "B", "B", "B", "B", "B"), value = c(0L, 3L, 0L, 7L, 5L, 
 5L, 5L, 7L, 6L, 2L, 7L, 10L, 0L, 6L), condition = c(0L, 0L, 1L, 
 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L)), class = "data.frame", 
 row.names = c(NA, -14L))
