Filter timestamped rows in one table using start and end times given in another table

Problem description

I have a table of downtime data that looks like this:

| Machine No | Start Time       | End Time         |
|------------|------------------|------------------|
| H18        | 01-01-2021 12:05 | 01-01-2021 12:15 |
| H19        | 02-01-2021 11:15 | 02-01-2021 13:15 |
| H20        | 01-01-2021 11:15 | 01-01-2021 13:15 |
| H21        | 02-01-2021 09:15 | 02-01-2021 13:55 |
| H22        | 02-01-2021 10:25 | 02-01-2021 10:35 |

I also have value stream data that looks like this. It is one table with the readings for all machines appended together:

| Machine No | timestamp        | Value |
|------------|------------------|-------|
| H18        | 01-01-2021 12:00 | 34    |
| H18        | 01-01-2021 12:01 | 74    |
| H18        | 01-01-2021 12:02 | 43    |
| H18        | 01-01-2021 12:03 | 60    |
| H18        | 01-01-2021 12:04 | 68    |
| H18        | 01-01-2021 12:05 | 17    |
| H18        | 01-01-2021 12:06 | 38    |
| H18        | 01-01-2021 12:07 | 91    |
| H18        | 01-01-2021 12:08 | 65    |
| H18        | 01-01-2021 12:09 | 80    |
| H18        | 01-01-2021 12:10 | 67    |
| H18        | 01-01-2021 12:11 | 78    |
| H18        | 01-01-2021 12:12 | 43    |
| H18        | 01-01-2021 12:13 | 53    |
| H18        | 01-01-2021 12:14 | 92    |
| H18        | 01-01-2021 12:15 | 11    |
| H18        | 01-01-2021 12:16 | 75    |
| H18        | 01-01-2021 12:17 | 61    |
| H18        | 01-01-2021 12:18 | 82    |
| H18        | 01-01-2021 12:19 | 50    |
| H18        | 01-01-2021 12:20 | 65    |
| H18        | 01-01-2021 12:21 | 23    |
| H18        | 01-01-2021 12:22 | 80    |
| H18        | 01-01-2021 12:23 | 55    |
| H18        | 01-01-2021 12:24 | 61    |
| H18        | 01-01-2021 12:25 | 11    |
| H18        | 01-01-2021 12:26 | 98    |

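I want to remove from the value stream table every row whose timestamp falls between the start time and end time listed in the downtime table. How can I do this in R?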

Tags: r, dplyr

Solution


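You can join `df1` and `df2` by `Machine.No`, convert the time columns to `POSIXct`, and keep only the rows whose `timestamp` lies outside the range from `Start.Time` to `End.Time`. Here `df1` is the downtime table and `df2` is the value stream table.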

library(dplyr)

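# Attach each machine's downtime window to its readings, parse the
# day-month-year strings, then drop readings inside the (inclusive) window
df1 %>%
  inner_join(df2, by = 'Machine.No') %>%
  mutate(across(c(Start.Time, End.Time, timestamp), lubridate::dmy_hm)) %>%
  filter(!(timestamp >= Start.Time & timestamp <= End.Time))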
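
To try the pipeline on a small scale, here is a minimal sketch that rebuilds a slice of the question's data. The column names `Machine.No`, `Start.Time` and `End.Time` are assumed to be what R produces when the tables are read in, and `kept` is just an illustrative name. It also assumes the `lubridate` package is installed.

```r
library(dplyr)

# Downtime table (df1): the H18 window from the question
df1 <- data.frame(
  Machine.No = "H18",
  Start.Time = "01-01-2021 12:05",
  End.Time   = "01-01-2021 12:15"
)

# Value stream (df2): five H18 readings around that window, copied from the question
df2 <- data.frame(
  Machine.No = "H18",
  timestamp  = c("01-01-2021 12:04", "01-01-2021 12:05", "01-01-2021 12:10",
                 "01-01-2021 12:15", "01-01-2021 12:16"),
  Value      = c(68, 17, 67, 11, 75)
)

kept <- df1 %>%
  inner_join(df2, by = "Machine.No") %>%
  mutate(across(c(Start.Time, End.Time, timestamp), lubridate::dmy_hm)) %>%
  filter(!(timestamp >= Start.Time & timestamp <= End.Time))

# The window is closed on both ends, so 12:05 and 12:15 are removed as well
print(kept$Value)  # 68 75
```

If readings taken exactly at the start or end time should be kept, swap `>=` and `<=` for `>` and `<`.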

Or in base R:

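# After the merge, columns 2:4 are Start.Time, End.Time and timestamp
res <- merge(df1, df2, by = 'Machine.No')
res[2:4] <- lapply(res[2:4], as.POSIXct, format = '%d-%m-%Y %H:%M', tz = 'UTC')
# Keep only readings outside the (inclusive) downtime window
subset(res, !(timestamp >= Start.Time & timestamp <= End.Time))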
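
Both snippets fit the sample data, where each machine has exactly one downtime window. If a machine can have several windows, the join produces one row per reading and window pair, so a reading inside one window still survives through the row paired with another window. Machines with no downtime entry are also dropped entirely by the inner join. If either case applies to your data, a per-reading check is one way around it. The helper `drop_downtime()` below is a hypothetical name, not part of any package. The second H18 window and the machine `H99` are invented purely for illustration.

```r
# Hypothetical helper: keep only readings that fall in none of the
# machine's downtime windows (window ends are treated as inclusive)
drop_downtime <- function(values, downtime, fmt = "%d-%m-%Y %H:%M") {
  ts <- as.POSIXct(values$timestamp, format = fmt, tz = "UTC")
  st <- as.POSIXct(downtime$Start.Time, format = fmt, tz = "UTC")
  en <- as.POSIXct(downtime$End.Time, format = fmt, tz = "UTC")
  v_machine <- as.character(values$Machine.No)
  d_machine <- as.character(downtime$Machine.No)

  # For each reading, test whether any window of the same machine covers it
  in_downtime <- vapply(seq_len(nrow(values)), function(i) {
    any(d_machine == v_machine[i] & ts[i] >= st & ts[i] <= en)
  }, logical(1))

  values[!in_downtime, , drop = FALSE]
}

# Illustrative data: two windows for H18 (the second one is made up)
downtime <- data.frame(
  Machine.No = c("H18", "H18"),
  Start.Time = c("01-01-2021 12:05", "01-01-2021 12:20"),
  End.Time   = c("01-01-2021 12:15", "01-01-2021 12:25")
)

# H99 is a made-up machine with no downtime entry
values <- data.frame(
  Machine.No = c("H18", "H18", "H18", "H18", "H99"),
  timestamp  = c("01-01-2021 12:04", "01-01-2021 12:10", "01-01-2021 12:16",
                 "01-01-2021 12:22", "01-01-2021 12:10"),
  Value      = c(68, 67, 75, 80, 5)
)

result <- drop_downtime(values, downtime)
print(result$Value)  # 68 75 5
```

This loops over readings, so it trades speed for clarity. On large tables a non-equi join (for example with `data.table` or a recent `dplyr`) would likely be faster, at the cost of an extra dependency.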
