r - Combine rows based on ranges in a column
Problem description
I have a fairly large dataset with a column for time in seconds. I want to combine rows whose times are close together (roughly 0.1-0.2 seconds apart) into a single row by taking the mean.
Here is an example of how the data looks:
BPM seconds
63.9 61.899
63.9 61.902
63.8 61.910
62.1 130.94
62.1 130.95
61.8 211.59
63.8 280.5
60.3 290.4
So I want to combine the first 3 rows, then the next 2, and leave the rest as they are. The result should look like this:
BPM seconds
63.9 61.904
62.1 130.95
61.8 211.59
63.8 280.5
60.3 290.4
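Solution
The key step is creating the groups; the rest is standard aggregation. A new group starts whenever the gap to the previous row is 0.2 seconds or more:
cumsum(c(0, diff(df1$seconds)) >= 0.2)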
# [1] 0 0 0 1 1 2 3 4
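To see why this expression produces the group ids, it may help to build it up one step at a time. The sketch below is self-contained and uses a standalone vector `secs` (a name introduced here for illustration) holding the `seconds` values from the question. It assumes the times are already sorted in ascending order.

```r
# Timestamps from the question, assumed sorted ascending
secs <- c(61.899, 61.902, 61.910, 130.94, 130.95, 211.59, 280.5, 290.4)

# Gap between each row and the previous one; the first row has no
# predecessor, so it gets a gap of 0 and stays in the first group
gaps <- c(0, diff(secs))

# TRUE wherever the gap is 0.2 s or more, i.e. where a new group starts
new_group <- gaps >= 0.2

# Running count of group starts gives one id per row
grp <- cumsum(new_group)
print(grp)
# [1] 0 0 0 1 1 2 3 4
```

Because each row is compared only with its immediate predecessor, a long run of closely spaced rows ends up in one group even if its first and last timestamps are more than 0.2 seconds apart.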
Then aggregate with aggregate():
aggregate(df1[, 2], list(cumsum(c(0, diff(df1$seconds)) >= 0.2)), mean)
#   Group.1         x
# 1       0  61.90367
# 2       1 130.94500
# 3       2 211.59000
# 4       3 280.50000
# 5       4 290.40000
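The call above returns only the averaged `seconds` column. If the BPM values should be averaged as well, one possible base R variant is the formula interface of `aggregate()`. This is a sketch rather than part of the original answer: the column name `grp` is introduced here for illustration, and averaging BPM (instead of taking the first value per group) is an assumption about what is wanted.

```r
# Example data from the question
df1 <- read.table(text = "BPM seconds
63.9 61.899
63.9 61.902
63.8 61.910
62.1 130.94
62.1 130.95
61.8 211.59
63.8 280.5
60.3 290.4", header = TRUE)

# Group id: increases by one whenever the gap to the previous row is >= 0.2 s
df1$grp <- cumsum(c(0, diff(df1$seconds)) >= 0.2)

# Average both columns within each group
res <- aggregate(cbind(BPM, seconds) ~ grp, data = df1, FUN = mean)
print(res)
```

The result has one row per group with the columns `grp`, `BPM`, and `seconds`; drop `grp` afterwards if only the two original columns are needed.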
Or using dplyr:
library(dplyr)
df1 %>%
  group_by(myGroup = cumsum(c(0, diff(seconds)) >= 0.2)) %>%
  summarise(BPM = first(BPM),
            seconds = mean(seconds))
# # A tibble: 5 x 3
#   myGroup   BPM seconds
#     <int> <dbl>   <dbl>
# 1       0  63.9    61.9
# 2       1  62.1   131.
# 3       2  61.8   212.
# 4       3  63.8   280.
# 5       4  60.3   290.
Reproducible example data:
df1 <- read.table(text = "BPM seconds
63.9 61.899
63.9 61.902
63.8 61.910
62.1 130.94
62.1 130.95
61.8 211.59
63.8 280.5
60.3 290.4", header = TRUE)