首页 > 解决方案 > 根据其他列值选择时间序列中的特定时间

问题描述

我已经在一个时间序列中连续测量了罐子的土壤排放量,现在我想删掉一些测量值,这样我就只有最后一次测量时间段了。

我有一个包含三列的数据集timejarmeasurement_type。我想剪掉第一行,这样我就得到了最后 12 分钟测量的罐子。当我测量一个罐子的最后 12 分钟时,我想剪掉最后两分钟测量的时间,所以我最终对每个罐子进行了 10 分钟的测量。

有什么想法可以顺利解决这个问题吗?

这是数据:

df <- structure(list(time = c("2021-04-02 23:40:20", "2021-04-02 23:41:15", 
"2021-04-02 23:42:10", "2021-04-02 23:43:05", "2021-04-02 23:44:55", 
"2021-04-02 23:45:50", "2021-04-02 23:46:45", "2021-04-02 23:47:40", 
"2021-04-02 23:48:35", "2021-04-02 23:49:30", "2021-04-02 23:50:25", 
"2021-04-02 23:52:15", "2021-04-02 23:53:10", "2021-04-02 23:54:05", 
"2021-04-02 23:55:00", "2021-04-02 23:55:55", "2021-04-02 23:56:50", 
"2021-04-02 23:57:45", "2021-04-02 23:58:40", "2021-04-03 00:00:30", 
"2021-04-03 00:01:25", "2021-04-03 00:02:20", "2021-04-03 00:03:15", 
"2021-04-03 00:04:10", "2021-04-03 00:06:00", "2021-04-03 00:06:55", 
"2021-04-03 00:07:50", "2021-04-03 00:08:45", "2021-04-03 00:09:40", 
"2021-04-03 00:10:35", "2021-04-03 00:11:30", "2021-04-03 00:13:20", 
"2021-04-03 00:14:15", "2021-04-03 00:15:10", "2021-04-03 00:16:05", 
"2021-04-03 00:17:55", "2021-04-03 00:18:50", "2021-04-03 00:19:45", 
"2021-04-03 00:20:40", "2021-04-03 00:22:30", "2021-04-03 00:23:25", 
"2021-04-03 00:24:20", "2021-04-03 00:26:10", "2021-04-03 00:27:05", 
"2021-04-03 00:28:00", "2021-04-03 00:28:55", "2021-04-03 00:30:45", 
"2021-04-03 00:31:40", "2021-04-03 00:32:35", "2021-04-03 00:34:25", 
"2021-04-03 00:35:20", "2021-04-03 00:36:15", "2021-04-03 00:37:10", 
"2021-04-03 00:39:00", "2021-04-03 00:39:55", "2021-04-03 00:40:50", 
"2021-04-03 00:41:45", "2021-04-03 00:43:35", "2021-04-03 00:44:30", 
"2021-04-03 00:45:25", "2021-04-03 00:47:15", "2021-04-03 00:48:10", 
"2021-04-03 00:49:05", "2021-04-03 00:50:00", "2021-04-03 00:50:55", 
"2021-04-03 00:51:50", "2021-04-03 00:52:45", "2021-04-03 00:53:40", 
"2021-04-03 00:55:30", "2021-04-03 00:56:25", "2021-04-03 00:57:20", 
"2021-04-03 00:58:15", "2021-04-03 00:59:10", "2021-04-03 01:00:05", 
"2021-04-03 01:01:00", "2021-04-03 01:02:50", "2021-04-03 01:03:45", 
"2021-04-03 01:04:40", "2021-04-03 01:05:35", "2021-04-03 01:06:30", 
"2021-04-03 01:07:25", "2021-04-03 01:08:20", "2021-04-03 01:09:15", 
"2021-04-03 01:10:10", "2021-04-03 01:11:05", "2021-04-03 01:11:59", 
"2021-04-03 01:13:49", "2021-04-03 01:14:44", "2021-04-03 01:15:39", 
"2021-04-03 01:17:29", "2021-04-03 01:18:24", "2021-04-03 01:19:19", 
"2021-04-03 01:20:14", "2021-04-03 01:21:09", "2021-04-03 01:22:04", 
"2021-04-03 01:22:59", "2021-04-03 01:23:54", "2021-04-03 01:24:49", 
"2021-04-03 01:25:44", "2021-04-03 01:26:39", "2021-04-03 01:28:29", 
"2021-04-03 01:29:24", "2021-04-03 01:30:19", "2021-04-03 01:31:14", 
"2021-04-03 01:32:09", "2021-04-03 01:33:04", "2021-04-03 01:33:59", 
"2021-04-03 01:35:49", "2021-04-03 01:36:44", "2021-04-03 01:37:39", 
"2021-04-03 01:38:34", "2021-04-03 01:39:29", "2021-04-03 01:40:24", 
"2021-04-03 01:42:14", "2021-04-03 01:43:09", "2021-04-03 01:44:04", 
"2021-04-03 01:44:59", "2021-04-03 01:45:54", "2021-04-03 01:46:49", 
"2021-04-03 01:47:44", "2021-04-03 01:48:39", "2021-04-03 01:49:34", 
"2021-04-03 01:50:29", "2021-04-03 01:51:24", "2021-04-03 01:52:19", 
"2021-04-03 01:53:14", "2021-04-03 01:54:09", "2021-04-03 01:55:59", 
"2021-04-03 01:56:54", "2021-04-03 01:57:49", "2021-04-03 01:58:44", 
"2021-04-03 01:59:39", "2021-04-03 02:00:34", "2021-04-03 02:01:29", 
"2021-04-03 02:03:19", "2021-04-03 02:04:14", "2021-04-03 02:05:09", 
"2021-04-03 02:06:04", "2021-04-03 02:06:59", "2021-04-03 02:07:54", 
"2021-04-03 02:08:49", "2021-04-03 02:09:44", "2021-04-03 02:11:34", 
"2021-04-03 02:12:29", "2021-04-03 02:13:24", "2021-04-03 02:14:19", 
"2021-04-03 02:15:14", "2021-04-03 02:16:09", "2021-04-03 02:17:04", 
"2021-04-03 02:18:54", "2021-04-03 02:19:49", "2021-04-03 02:20:44", 
"2021-04-03 02:21:39", "2021-04-03 02:22:34", "2021-04-03 02:23:29", 
"2021-04-03 02:24:24", "2021-04-03 02:25:19", "2021-04-03 02:26:14", 
"2021-04-03 02:27:09", "2021-04-03 02:28:04", "2021-04-03 02:28:59", 
"2021-04-03 02:29:54", "2021-04-03 02:30:49", "2021-04-03 02:31:44", 
"2021-04-03 02:32:39", "2021-04-03 02:33:34", "2021-04-03 02:34:29", 
"2021-04-03 02:35:24", "2021-04-03 02:36:19", "2021-04-03 02:37:14", 
"2021-04-03 02:38:09", "2021-04-03 02:39:04", "2021-04-03 02:39:59", 
"2021-04-03 02:40:54", "2021-04-03 02:41:49", "2021-04-03 02:42:44", 
"2021-04-03 02:43:39", "2021-04-03 02:44:34", "2021-04-03 02:45:29", 
"2021-04-03 02:46:24", "2021-04-03 02:47:19", "2021-04-03 02:48:14", 
"2021-04-03 02:49:09", "2021-04-03 02:50:04", "2021-04-03 02:50:59", 
"2021-04-03 02:51:54", "2021-04-03 02:52:49", "2021-04-03 02:53:44", 
"2021-04-03 02:54:39", "2021-04-03 02:55:34", "2021-04-03 02:56:29", 
"2021-04-03 02:57:24", "2021-04-03 02:58:19", "2021-04-03 02:59:14", 
"2021-04-03 03:00:09", "2021-04-03 03:01:04", "2021-04-03 03:01:59", 
"2021-04-03 03:02:54", "2021-04-03 03:03:49", "2021-04-03 03:04:44", 
"2021-04-03 03:05:39", "2021-04-03 03:06:34", "2021-04-03 03:07:29", 
"2021-04-03 03:08:24", "2021-04-03 03:09:19", "2021-04-03 03:10:14", 
"2021-04-03 03:11:09", "2021-04-03 03:12:04", "2021-04-03 03:12:59", 
"2021-04-03 03:13:54", "2021-04-03 03:14:49", "2021-04-03 03:15:44", 
"2021-04-03 03:16:39", "2021-04-03 03:17:34", "2021-04-03 03:18:29", 
"2021-04-03 03:19:24", "2021-04-03 03:20:19", "2021-04-03 03:21:14", 
"2021-04-03 03:22:09", "2021-04-03 03:23:04", "2021-04-03 03:23:59"
), jar = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), measurement_type = c("a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
"b", "b", "b", "b", "b")), class = "data.frame", row.names = c(NA, 
-221L))

这是我期望 jar 1 的样子:

expected_df <- structure(list(time = c("2021-04-02 23:57:45", "2021-04-02 23:58:40", 
"2021-04-03 00:00:30", "2021-04-03 00:01:25", "2021-04-03 00:02:20", 
"2021-04-03 00:03:15", "2021-04-03 00:04:10", "2021-04-03 00:06:00", 
"2021-04-03 00:06:55", "2021-04-03 00:53:40", "2021-04-03 00:55:30", 
"2021-04-03 00:56:25", "2021-04-03 00:57:20", "2021-04-03 00:58:15", 
"2021-04-03 00:59:10", "2021-04-03 01:00:05", "2021-04-03 01:01:00", 
"2021-04-03 01:02:50"), jar = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), measurement_type = c("a", 
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", 
"a", "a", "a", "a")), class = "data.frame", row.names = c(NA, 
-18L))

标签: rtidyverse

解决方案


看起来您只使用时间,而忽略了日期。如果是这样,您可以使用ITimefrom data.table(或者您可以将日期设置为同一日期),以便比较时间。

在这种情况下,如果要处理每个jarmeasurement_type组合,请使用group_by,然后filter基于时间。由于时间以秒为单位,我乘以 60 分钟,参考最新(或max)时间。

最后一个filter被添加到只是为了重现与帖子中相同的结果,并且可以删除。

library(tidyverse)
library(data.table)

df %>%
  mutate(time_only = as.ITime(time)) %>%
  group_by(jar, measurement_type) %>%
  filter(time_only > max(time_only) - (60 * 12),
         time_only < max(time_only) - (60 * 2)) %>%
  filter(jar == 1, measurement_type == "a")

输出

   time                  jar measurement_type time_only
   <chr>               <int> <chr>            <ITime>  
 1 2021-06-06 12:35:45     1 a                12:35:45 
 2 2021-06-15 12:36:30     1 a                12:36:30 
 3 2021-06-18 12:36:45     1 a                12:36:45 
 4 2021-06-24 12:37:15     1 a                12:37:15 
 5 2021-06-26 12:37:25     1 a                12:37:25 
 6 2021-06-29 12:37:40     1 a                12:37:40 
 7 2021-07-06 12:38:15     1 a                12:38:15 
 8 2021-07-12 12:38:45     1 a                12:38:45 
 9 2021-07-21 12:39:30     1 a                12:39:30 
10 2021-08-05 12:40:45     1 a                12:40:45 
11 2021-08-29 12:42:45     1 a                12:42:45 
12 2021-09-08 12:43:35     1 a                12:43:35 
13 2021-09-14 12:44:05     1 a                12:44:05 
14 2021-09-22 12:44:45     1 a                12:44:45

推荐阅读