r - Determine if next number in a time series is the max of time series so far (for grouped df)
Problem description
I am looking at time series data and trying to identify historical maximums.
I am trying to do this by iterating over a vector and checking if the value I am looking at is greater than or equal to the max of the data up to this point. I can write a function for this, but I am struggling when I want to apply it to a grouped data frame.
Here is an example:
set.seed(32)
x <- data.frame(time = c(1:6),
                value = runif(6))
> x
time value
1 1 0.5058405
2 2 0.5948084
3 3 0.8087471
4 4 0.7288197
5 5 0.1519876
6 6 0.9561873
# write a function to identify the records
# the function takes an index and checks whether the value at that index is
# greater than or equal to the maximum of all values up to and including it
max_v <- function(index) {
  output <- x$value[index] >= max(x$value[1:index])
  output
}
#create the record variable
x$record <- sapply(1:nrow(x), max_v)
> x
time value record
1 1 0.5058405 TRUE
2 2 0.5948084 TRUE
3 3 0.8087471 TRUE
4 4 0.7288197 FALSE
5 5 0.1519876 FALSE
6 6 0.9561873 TRUE
The function works well. However, I want to apply it to a data frame grouped by the type variable created below:
set.seed(32)
x <- data.frame(time = rep(c(1:6), 2),
                type = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                value = runif(12))
> x
time type value
1 1 1 0.5058405
2 2 1 0.5948084
3 3 1 0.8087471
4 4 1 0.7288197
5 5 1 0.1519876
6 6 1 0.9561873
7 1 2 0.7535377
8 2 2 0.8520623
9 3 2 0.6734418
10 4 2 0.3871255
11 5 2 0.6580025
12 6 2 0.3213696
What I want is:
> x
time type value record
1 1 1 0.5058405 TRUE
2 2 1 0.5948084 TRUE
3 3 1 0.8087471 TRUE
4 4 1 0.7288197 FALSE
5 5 1 0.1519876 FALSE
6 6 1 0.9561873 TRUE
7 1 2 0.7535377 TRUE
8 2 2 0.8520623 TRUE
9 3 2 0.6734418 FALSE
10 4 2 0.3871255 FALSE
11 5 2 0.6580025 FALSE
12 6 2 0.3213696 FALSE
I have tried group_map and tapply, but I can't get intelligible results, because I don't know how to pass the vector of indexes that I want to apply/map over.
Solution
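You can compare each value against the cumulative maximum within its group, so no index vector is needed. cummax(v) returns the running maximum of v, and v == cummax(v) is TRUE exactly where a value matches or exceeds everything before it (ties count, like the >= in your function). ave applies the function per type and returns the results in the original row order, but it coerces them to numeric 0/1, hence the as.logical. The \(v) shorthand requires R 4.1 or later; write function(v) on older versions.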
x$record <- as.logical(with(x, ave(value, type, FUN = \(v) v == cummax(v))))
x
time type value record
1 1 1 0.5058405 TRUE
2 2 1 0.5948084 TRUE
3 3 1 0.8087471 TRUE
4 4 1 0.7288197 FALSE
5 5 1 0.1519876 FALSE
6 6 1 0.9561873 TRUE
7 1 2 0.7535377 TRUE
8 2 2 0.8520623 TRUE
9 3 2 0.6734418 FALSE
10 4 2 0.3871255 FALSE
11 5 2 0.6580025 FALSE
12 6 2 0.3213696 FALSE
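As a sanity check, here is a minimal base R sketch that compares the cummax approach with a brute-force version of the index-based function from the question, applied per group. The helper names is_record and is_record_slow and the record_slow column are made up for this illustration; they are not part of the original answer. It uses split and unsplit instead of ave, which keeps the result logical without as.logical.

```r
set.seed(32)
x <- data.frame(time  = rep(1:6, 2),
                type  = rep(c(1, 2), each = 6),
                value = runif(12))

# Hypothetical helper: a value is a record when it equals the running maximum.
is_record <- function(v) v == cummax(v)

# Hypothetical helper: brute-force version mirroring the question's max_v(),
# but taking the vector as an argument instead of reading the global x.
is_record_slow <- function(v) {
  vapply(seq_along(v), function(i) v[i] >= max(v[seq_len(i)]), logical(1))
}

# split() breaks value into one vector per type; unsplit() puts the
# per-group results back in the original row order.
x$record      <- unsplit(lapply(split(x$value, x$type), is_record), x$type)
x$record_slow <- unsplit(lapply(split(x$value, x$type), is_record_slow), x$type)

# Both approaches should agree on every row.
stopifnot(identical(x$record, x$record_slow))
```

If you are already working with dplyr, the same expression should also work inside mutate() after group_by(type), i.e. mutate(record = value == cummax(value)), since mutate evaluates it separately for each group.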