r - Assign values of a new column based on the frequency of a special pattern in dataframe
问题描述
I would like to create another column of a data frame that groups each member in the first column based on the order.
Here is a reproducible demo:
df1=c("Alex","23","ID #:123", "John","26","ID #:564")
df1=data.frame(df1)
library(dplyr)
library(data.table)
df1 %>% mutate(group= ifelse(df1 %like% "ID #:",1,NA ) )
This was the output from the demo:
df1 group
1 Alex NA
2 23 NA
3 ID #:123 1
4 John NA
5 26 NA
6 ID #:564 1
This is what I want:
df1 group
1 Alex 1
2 23 1
3 ID #:123 1
4 John 2
5 26 2
6 ID #:564 2
So I want to have a group column indicates each member in order.
I appreciate in advance for any reply or thoughts!
解决方案
用 first 转移条件,lag
然后做一个cumsum
:
df1 %>%
mutate(group= cumsum(lag(df1 %like% "ID #:", default = 1)))
# df1 group
#1 Alex 1
#2 23 1
#3 ID #:123 1
#4 John 2
#5 26 2
#6 ID #:564 2
细节:
df1 %>%
mutate(
# calculate the condition
cond = df1 %like% "ID #:",
# shift the condition down and fill the first value with 1
lag_cond = lag(cond, default = 1),
# increase the group when the condition is TRUE (ID encountered)
group= cumsum(lag_cond))
# df1 cond lag_cond group
#1 Alex FALSE TRUE 1
#2 23 FALSE FALSE 1
#3 ID #:123 TRUE FALSE 1
#4 John FALSE TRUE 2
#5 26 FALSE FALSE 2
#6 ID #:564 TRUE FALSE 2
推荐阅读
- mysql - SQL中将INT类型的YYYYMMDDHHMMSS改为VARCHAR的YYYY-MM-DD
- excel - VBA 和目录有什么问题?
- php - PHP mail() 函数没有做它应该做的事情
- c# - 分布式事务完成。在新事务或 NULL 事务中登记此会话。在第一个 Scope.complete() 之后
- javascript - id 未在联系人列表中分配
- arrays - 访问 JSON 字符串数组
- javascript - 如何进行递归承诺调用?
- r - 使用 R 生成分类散点图
- caddy - 球童:动态重定向 www 到非 www 反之亦然
- python-3.x - Discord Music bot VoiceClient' 对象没有属性 'create_ytdl_player'