r - 根据其他列值选择时间序列中的特定时间
问题描述
我已经在一个时间序列中连续测量了罐子的土壤排放量,现在我想删掉一些测量值,这样我就只有最后一次测量时间段了。
我有一个包含三列的数据集time
:jar
和measurement_type
。我想剪掉第一行,这样我就得到了最后 12 分钟测量的罐子。当我测量一个罐子的最后 12 分钟时,我想剪掉最后两分钟测量的时间,所以我最终对每个罐子进行了 10 分钟的测量。
有什么想法可以顺利解决这个问题吗?
这是数据:
df <- structure(list(time = c("2021-04-02 23:40:20", "2021-04-02 23:41:15",
"2021-04-02 23:42:10", "2021-04-02 23:43:05", "2021-04-02 23:44:55",
"2021-04-02 23:45:50", "2021-04-02 23:46:45", "2021-04-02 23:47:40",
"2021-04-02 23:48:35", "2021-04-02 23:49:30", "2021-04-02 23:50:25",
"2021-04-02 23:52:15", "2021-04-02 23:53:10", "2021-04-02 23:54:05",
"2021-04-02 23:55:00", "2021-04-02 23:55:55", "2021-04-02 23:56:50",
"2021-04-02 23:57:45", "2021-04-02 23:58:40", "2021-04-03 00:00:30",
"2021-04-03 00:01:25", "2021-04-03 00:02:20", "2021-04-03 00:03:15",
"2021-04-03 00:04:10", "2021-04-03 00:06:00", "2021-04-03 00:06:55",
"2021-04-03 00:07:50", "2021-04-03 00:08:45", "2021-04-03 00:09:40",
"2021-04-03 00:10:35", "2021-04-03 00:11:30", "2021-04-03 00:13:20",
"2021-04-03 00:14:15", "2021-04-03 00:15:10", "2021-04-03 00:16:05",
"2021-04-03 00:17:55", "2021-04-03 00:18:50", "2021-04-03 00:19:45",
"2021-04-03 00:20:40", "2021-04-03 00:22:30", "2021-04-03 00:23:25",
"2021-04-03 00:24:20", "2021-04-03 00:26:10", "2021-04-03 00:27:05",
"2021-04-03 00:28:00", "2021-04-03 00:28:55", "2021-04-03 00:30:45",
"2021-04-03 00:31:40", "2021-04-03 00:32:35", "2021-04-03 00:34:25",
"2021-04-03 00:35:20", "2021-04-03 00:36:15", "2021-04-03 00:37:10",
"2021-04-03 00:39:00", "2021-04-03 00:39:55", "2021-04-03 00:40:50",
"2021-04-03 00:41:45", "2021-04-03 00:43:35", "2021-04-03 00:44:30",
"2021-04-03 00:45:25", "2021-04-03 00:47:15", "2021-04-03 00:48:10",
"2021-04-03 00:49:05", "2021-04-03 00:50:00", "2021-04-03 00:50:55",
"2021-04-03 00:51:50", "2021-04-03 00:52:45", "2021-04-03 00:53:40",
"2021-04-03 00:55:30", "2021-04-03 00:56:25", "2021-04-03 00:57:20",
"2021-04-03 00:58:15", "2021-04-03 00:59:10", "2021-04-03 01:00:05",
"2021-04-03 01:01:00", "2021-04-03 01:02:50", "2021-04-03 01:03:45",
"2021-04-03 01:04:40", "2021-04-03 01:05:35", "2021-04-03 01:06:30",
"2021-04-03 01:07:25", "2021-04-03 01:08:20", "2021-04-03 01:09:15",
"2021-04-03 01:10:10", "2021-04-03 01:11:05", "2021-04-03 01:11:59",
"2021-04-03 01:13:49", "2021-04-03 01:14:44", "2021-04-03 01:15:39",
"2021-04-03 01:17:29", "2021-04-03 01:18:24", "2021-04-03 01:19:19",
"2021-04-03 01:20:14", "2021-04-03 01:21:09", "2021-04-03 01:22:04",
"2021-04-03 01:22:59", "2021-04-03 01:23:54", "2021-04-03 01:24:49",
"2021-04-03 01:25:44", "2021-04-03 01:26:39", "2021-04-03 01:28:29",
"2021-04-03 01:29:24", "2021-04-03 01:30:19", "2021-04-03 01:31:14",
"2021-04-03 01:32:09", "2021-04-03 01:33:04", "2021-04-03 01:33:59",
"2021-04-03 01:35:49", "2021-04-03 01:36:44", "2021-04-03 01:37:39",
"2021-04-03 01:38:34", "2021-04-03 01:39:29", "2021-04-03 01:40:24",
"2021-04-03 01:42:14", "2021-04-03 01:43:09", "2021-04-03 01:44:04",
"2021-04-03 01:44:59", "2021-04-03 01:45:54", "2021-04-03 01:46:49",
"2021-04-03 01:47:44", "2021-04-03 01:48:39", "2021-04-03 01:49:34",
"2021-04-03 01:50:29", "2021-04-03 01:51:24", "2021-04-03 01:52:19",
"2021-04-03 01:53:14", "2021-04-03 01:54:09", "2021-04-03 01:55:59",
"2021-04-03 01:56:54", "2021-04-03 01:57:49", "2021-04-03 01:58:44",
"2021-04-03 01:59:39", "2021-04-03 02:00:34", "2021-04-03 02:01:29",
"2021-04-03 02:03:19", "2021-04-03 02:04:14", "2021-04-03 02:05:09",
"2021-04-03 02:06:04", "2021-04-03 02:06:59", "2021-04-03 02:07:54",
"2021-04-03 02:08:49", "2021-04-03 02:09:44", "2021-04-03 02:11:34",
"2021-04-03 02:12:29", "2021-04-03 02:13:24", "2021-04-03 02:14:19",
"2021-04-03 02:15:14", "2021-04-03 02:16:09", "2021-04-03 02:17:04",
"2021-04-03 02:18:54", "2021-04-03 02:19:49", "2021-04-03 02:20:44",
"2021-04-03 02:21:39", "2021-04-03 02:22:34", "2021-04-03 02:23:29",
"2021-04-03 02:24:24", "2021-04-03 02:25:19", "2021-04-03 02:26:14",
"2021-04-03 02:27:09", "2021-04-03 02:28:04", "2021-04-03 02:28:59",
"2021-04-03 02:29:54", "2021-04-03 02:30:49", "2021-04-03 02:31:44",
"2021-04-03 02:32:39", "2021-04-03 02:33:34", "2021-04-03 02:34:29",
"2021-04-03 02:35:24", "2021-04-03 02:36:19", "2021-04-03 02:37:14",
"2021-04-03 02:38:09", "2021-04-03 02:39:04", "2021-04-03 02:39:59",
"2021-04-03 02:40:54", "2021-04-03 02:41:49", "2021-04-03 02:42:44",
"2021-04-03 02:43:39", "2021-04-03 02:44:34", "2021-04-03 02:45:29",
"2021-04-03 02:46:24", "2021-04-03 02:47:19", "2021-04-03 02:48:14",
"2021-04-03 02:49:09", "2021-04-03 02:50:04", "2021-04-03 02:50:59",
"2021-04-03 02:51:54", "2021-04-03 02:52:49", "2021-04-03 02:53:44",
"2021-04-03 02:54:39", "2021-04-03 02:55:34", "2021-04-03 02:56:29",
"2021-04-03 02:57:24", "2021-04-03 02:58:19", "2021-04-03 02:59:14",
"2021-04-03 03:00:09", "2021-04-03 03:01:04", "2021-04-03 03:01:59",
"2021-04-03 03:02:54", "2021-04-03 03:03:49", "2021-04-03 03:04:44",
"2021-04-03 03:05:39", "2021-04-03 03:06:34", "2021-04-03 03:07:29",
"2021-04-03 03:08:24", "2021-04-03 03:09:19", "2021-04-03 03:10:14",
"2021-04-03 03:11:09", "2021-04-03 03:12:04", "2021-04-03 03:12:59",
"2021-04-03 03:13:54", "2021-04-03 03:14:49", "2021-04-03 03:15:44",
"2021-04-03 03:16:39", "2021-04-03 03:17:34", "2021-04-03 03:18:29",
"2021-04-03 03:19:24", "2021-04-03 03:20:19", "2021-04-03 03:21:14",
"2021-04-03 03:22:09", "2021-04-03 03:23:04", "2021-04-03 03:23:59"
), jar = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), measurement_type = c("a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b")), class = "data.frame", row.names = c(NA,
-221L))
这是我期望 jar 1 的样子:
expected_df <- structure(list(time = c("2021-04-02 23:57:45", "2021-04-02 23:58:40",
"2021-04-03 00:00:30", "2021-04-03 00:01:25", "2021-04-03 00:02:20",
"2021-04-03 00:03:15", "2021-04-03 00:04:10", "2021-04-03 00:06:00",
"2021-04-03 00:06:55", "2021-04-03 00:53:40", "2021-04-03 00:55:30",
"2021-04-03 00:56:25", "2021-04-03 00:57:20", "2021-04-03 00:58:15",
"2021-04-03 00:59:10", "2021-04-03 01:00:05", "2021-04-03 01:01:00",
"2021-04-03 01:02:50"), jar = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), measurement_type = c("a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a")), class = "data.frame", row.names = c(NA,
-18L))
解决方案
看起来您只使用时间,而忽略了日期。如果是这样,您可以使用ITime
from data.table
(或者您可以将日期设置为同一日期),以便比较时间。
在这种情况下,如果要处理每个jar
和measurement_type
组合,请使用group_by
,然后filter
基于时间。由于时间以秒为单位,我乘以 60 分钟,参考最新(或max
)时间。
最后一个filter
被添加到只是为了重现与帖子中相同的结果,并且可以删除。
library(tidyverse)
library(data.table)
df %>%
mutate(time_only = as.ITime(time)) %>%
group_by(jar, measurement_type) %>%
filter(time_only > max(time_only) - (60 * 12),
time_only < max(time_only) - (60 * 2)) %>%
filter(jar == 1, measurement_type == "a")
输出
time jar measurement_type time_only
<chr> <int> <chr> <ITime>
1 2021-06-06 12:35:45 1 a 12:35:45
2 2021-06-15 12:36:30 1 a 12:36:30
3 2021-06-18 12:36:45 1 a 12:36:45
4 2021-06-24 12:37:15 1 a 12:37:15
5 2021-06-26 12:37:25 1 a 12:37:25
6 2021-06-29 12:37:40 1 a 12:37:40
7 2021-07-06 12:38:15 1 a 12:38:15
8 2021-07-12 12:38:45 1 a 12:38:45
9 2021-07-21 12:39:30 1 a 12:39:30
10 2021-08-05 12:40:45 1 a 12:40:45
11 2021-08-29 12:42:45 1 a 12:42:45
12 2021-09-08 12:43:35 1 a 12:43:35
13 2021-09-14 12:44:05 1 a 12:44:05
14 2021-09-22 12:44:45 1 a 12:44:45