r - Is there a way to loop over data by a factor in a column and add up the number of rows?
Question
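I have some data in which there are multiple observations of the same event. Based on a time threshold, I want to condense the observations. But I also want to know how many I am condensing (i.e., how many observations become one). I'm not sure how to iterate over my data frame in this way.
I've tried writing for loops, if statements, and while statements, and I have searched Google and Stack Overflow tirelessly. Nothing I found seems to match what I need to do.
Here is a subset of my data: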
structure(list(date.time = structure(c(1465877617, 1465877774,
1465877816, 1465877844, 1465912214, 1465912806, 1465912862, 1465914033
), class = c("POSIXct", "POSIXt"), tzone = "America/New_York"),
time = structure(1:8, .Label = c("00:13:37", "00:16:14",
"00:16:56", "00:17:24", "09:50:14", "10:00:06", "10:01:02",
"10:20:33"), class = "factor"), X = c(1, 1, 1, 1, 1, 1, 1,
1), diff_time1 = structure(c(157, 42, 28, 34370, 592, 56,
1171, 2820), class = "difftime", units = "secs"), diff_time2 = c(FALSE,
FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE), new = c("start",
"include", "include", "end", "start", "include", "end", "start-end"
)), row.names = c(NA, 8L), class = "data.frame")
The goal is for it to look like the following, but with an additional sample-size column for each "smushed" observation:
structure(list(n = 1:8, end = structure(c(1465877844, 1465912862,
1465914033, 1465916853, 1465921999, 1465928992, 1465933159, 1465937668
), class = c("POSIXct", "POSIXt")), start = structure(c(1465877617,
1465912214, 1465914033, 1465916853, 1465921999, 1465928647, 1465932867,
1465937418), class = c("POSIXct", "POSIXt")), date = structure(c(16966,
16966, 16966, 16966, 16966, 16966, 16966, 16966), class = "Date")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
Answer
library(dplyr); library(lubridate)
df %>%
  mutate(time_since_last = (date.time - lag(date.time, default = first(date.time))) / dminutes(1)) %>%
  # How many times was there a 15min+ gap? Each new one increments "group"
  mutate(group = 1 + cumsum(time_since_last > 15)) %>%
  group_by(group) %>%
  summarize(first = min(date.time),  # or first(date.time) if sorted
            last = max(date.time),   # or last(date.time) if sorted
            count = n())
## A tibble: 3 x 4
# group first last count
# <dbl> <dttm> <dttm> <int>
#1 1 2016-06-14 00:13:37 2016-06-14 00:17:24 4
#2 2 2016-06-14 09:50:14 2016-06-14 10:01:02 3
#3 3 2016-06-14 10:20:33 2016-06-14 10:20:33 1
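The same gap-and-cumsum idea can also be sketched in base R, with the time unit stated explicitly and the columns named as in the question's desired output (n, start, end, date). This is only an illustrative sketch: the 15-minute threshold is taken from the answer above, the variable names (threshold_mins, gap_mins, first_idx, last_idx) are made up for the example, and it assumes the timestamps are already sorted in ascending order.

```r
# Timestamps from the question's sample data (seconds since the epoch).
date.time <- as.POSIXct(
  c(1465877617, 1465877774, 1465877816, 1465877844,
    1465912214, 1465912806, 1465912862, 1465914033),
  origin = "1970-01-01", tz = "America/New_York"
)

threshold_mins <- 15  # assumed threshold, same value as in the answer above

# Gap to the previous observation, in minutes. The first row has no
# predecessor, so its gap is set to 0.
gap_mins <- c(0, as.numeric(diff(date.time), units = "mins"))

# Every gap above the threshold starts a new group.
group <- 1 + cumsum(gap_mins > threshold_mins)

# Row positions of the first and last observation in each group
# (valid because date.time is assumed to be sorted).
first_idx <- which(!duplicated(group))
last_idx  <- which(!duplicated(group, fromLast = TRUE))

result <- data.frame(
  n     = as.vector(table(group)),  # observations condensed into each row
  start = date.time[first_idx],
  end   = date.time[last_idx],
  date  = as.Date(date.time[first_idx], tz = "America/New_York")
)
print(result)
```

On the sample data this produces three rows with n equal to 4, 3 and 1, the same grouping as the dplyr output above. Asking for the gaps in minutes via as.numeric(..., units = "mins") avoids depending on whichever unit the difftime subtraction happens to pick automatically.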