Join two datasets based on timestamps and combine their descriptor variables

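Problem description

I have two data frames like the sample data below (df1 and df2). I need to join them and combine their descriptor variable (motion), so that a new record is created in df1 whenever the motion state changes.

The goal is to attach the motion state to each viewing interval. This is the result I want: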

date        id        viewing_start  viewing_end  motion
2019-01-01  18404885  155900         155959       ON
2019-01-01  18404885  160400         160859       ON
2019-01-01  18404885  170100         170259       ON
2019-01-01  18404885  170400         170459       ON
2019-01-01  18404885  170500         171259       ON
2019-01-08  18404885  201100         201859       idle

My datasets (viewing_start, viewing_end, start_time and end_time were originally times in H:M:S format):

library(dplyr)  # provides %>% and mutate(); anytime is called via anytime::

# Viewing intervals: one row per viewing session
df1 <- data.frame(
  date = c("2019-01-01", "2019-01-01", "2019-01-01", "2019-01-01",
           "2019-01-01", "2019-01-08", "2019-01-08", "2019-01-08",
           "2019-01-08", "2019-01-08", "2019-01-08", "2019-01-08",
           "2019-01-08", "2019-01-08", "2019-01-08", "2019-01-08",
           "2019-01-08", "2019-01-08", "2019-01-08"),
  id = c(18404885, 18404885, 18404885, 18404885, 18404885,
         18404885, 18404885, 18404885, 18404885, 18404885,
         18404885, 18404885, 18404885, 18404885, 18404885, 18404885,
         18404885, 18404885, 18404885),
  viewing_start = c(155900, 160400, 170100, 170400, 170500, 201100, 202000,
                    203300, 203700, 204100, 204900, 205200, 205600, 210000,
                    210200, 210800, 211700, 212400, 212900),
  viewing_end = c(155959, 160859, 170259, 170459, 171259, 201859, 202159,
                  203459, 203859, 204759, 204959, 205259, 205659, 210059,
                  210659, 211659, 211859, 212759, 220259)
) %>%
  # Build "YYYY-MM-DD HHMMSS" strings and parse them as CET date-times
  mutate(viewing_start = anytime::anytime(paste0(date, " ", viewing_start), tz = "CET"),
         viewing_end   = anytime::anytime(paste0(date, " ", viewing_end), tz = "CET"))

# Motion states: one row per interval with a constant motion state
# Note: "250000" and "241245" have an hour above 23, so they are not valid
# clock times as written.
df2 <- data.frame(stringsAsFactors = FALSE,
                  id = c(18404885, 18404885, 18404885, 18404885, 18404885, 18404885,
                         18404885, 18404885, 18404885, 18404885, 18404885),
                  date = c("2019-01-01", "2019-01-01", "2019-01-01", "2019-01-01",
                           "2019-01-01", "2019-01-08", "2019-01-08", "2019-01-08",
                           "2019-01-08", "2019-01-08", "2019-01-08"),
                  start_time = c("030000", "033715", "080615", "225215", "250000",
                                 "030000", "033700", "082915", "204100", "205100",
                                 "220315"),
                  end_time = c("033715", "080615", "225215", "250000", "030000",
                               "033700", "082915", "204100", "205100", "220315",
                               "241245"),
                  motion = c("idle", "idle", "ON", "idle", "idle", "idle", "idle", "ON",
                             "WARN", "OFF", "ON")) %>%
  mutate(start_time = paste0(date, " ", start_time),
         end_time   = paste0(date, " ", end_time)) %>%
  # No tz is passed here, so anytime() uses the session's local time zone (df1 uses CET)
  mutate(start_time = anytime::anytime(start_time),
         end_time   = anytime::anytime(end_time))

Tags: r, join, merge, dplyr, tidyverse

Solution
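
No answer text survives on this page, so what follows is only a minimal sketch of one way the join described in the question could be approached, not the original answer. It uses base R to pair every viewing interval with every motion interval of the same id and date, keeps the pairs that overlap, and clips each viewing interval to the overlapping part so that a change of motion state produces a new row. The names `parse_ts`, `views`, `states`, `pairs` and `result` are hypothetical, the data is a small subset of the question's 2019-01-08 rows, and the viewing interval 204900 to 205259 is made up so that one interval spans a state change. Both tables are parsed in CET here, which is an assumption (the question only sets CET for df1).

```r
# Hypothetical helper: parse "YYYY-MM-DD" + "HHMMSS" into POSIXct (CET assumed)
parse_ts <- function(date, hms) {
  as.POSIXct(paste(date, hms), format = "%Y-%m-%d %H%M%S", tz = "CET")
}

# Toy viewing intervals; the last one (204900-205259) is invented for the demo
views <- data.frame(
  id = 18404885,
  date = "2019-01-08",
  viewing_start = c("203700", "204100", "204900"),
  viewing_end   = c("203859", "204759", "205259"),
  stringsAsFactors = FALSE
)
views$viewing_start <- parse_ts(views$date, views$viewing_start)
views$viewing_end   <- parse_ts(views$date, views$viewing_end)

# Motion intervals taken from the question's df2 for the same day
states <- data.frame(
  id = 18404885,
  date = "2019-01-08",
  start_time = c("082915", "204100", "205100"),
  end_time   = c("204100", "205100", "220315"),
  motion = c("ON", "WARN", "OFF"),
  stringsAsFactors = FALSE
)
states$start_time <- parse_ts(states$date, states$start_time)
states$end_time   <- parse_ts(states$date, states$end_time)

# Pair every viewing interval with every motion interval of the same id and date
pairs <- merge(views, states, by = c("id", "date"))

# Keep only the pairs whose intervals actually overlap
pairs <- pairs[pairs$start_time < pairs$viewing_end &
               pairs$end_time > pairs$viewing_start, ]

# Clip each viewing interval to the part covered by the motion interval
pairs$viewing_start <- pmax(pairs$viewing_start, pairs$start_time)
pairs$viewing_end   <- pmin(pairs$viewing_end, pairs$end_time)

result <- pairs[order(pairs$viewing_start),
                c("date", "id", "viewing_start", "viewing_end", "motion")]
rownames(result) <- NULL
print(result)
```

In this sketch a split segment ends exactly at the moment the state changes (205100), whereas the question's intervals end on second 59; subtracting one second from clipped ends would be one way to match that convention. The cross join inside `merge()` grows with the product of rows per id and date, so for large data a dedicated overlap join (for example `data.table::foverlaps`) may be preferable.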

