r - 将时差拆分为各个小时
问题描述
我有一个data.table
包含开始和结束日期的日期和时间戳。我知道我可以使用该difftime
函数计算时间差。但我想计算时间差并将其拆分为每一行中的单独小时以及其余数据的副本。例如:
Start_time End_time
1: 2017-01-01 00:15:00 2017-01-01 00:23:00
2: 2017-01-01 00:24:00 2017-02-01 00:10:00
3: 2017-01-01 23:38:00 2017-02-01 00:45:00
4: 2017-01-01 14:38:00 2017-01-01 07:43:00
5: 2017-01-01 06:38:00 2017-01-01 00:43:00
我做了以下提取开始时间和结束时间
Start_time End_time start_hour end hour
1: 2017-01-01 00:15:00 2017-01-01 00:23:00 0 0
2: 2017-01-01 00:24:00 2017-02-01 00:10:00 0 0
3: 2017-01-01 23:38:00 2017-02-01 00:45:00 23 0
4: 2017-01-01 14:38:00 2017-01-01 07:43:00 14 7
5: 2017-01-01 06:38:00 2017-01-01 00:43:00 6 0
now, i have to find all the temperatures between start_hour and end_hour inclusive and populate my table
i have a weather report which is collected hourly with corresponding temperature,
**DateTime** **Temperature** hour
1 2017-01-01 06:00:00 16 6
2 2017-01-01 05:00:00 16 5
3 2017-01-01 04:00:00 16 4
4 2017-01-01 08:00:00 9 8
5 2017-01-01 03:00:00 5 3
6 2017-01-01 00:00:00 8 0
7 2017-01-01 07:00:00 5 7
i have to get all the temperatures between the starttime and end time (inclusive)
ie. if a ride is for 4 hours on a day, 2017-01-01 00:05:00 till 2017-01-01
04:00:00 i need 4 rows with 4 different temperatures 00:00:00 , 01:00:00, 02:00:00, 03:00:00,04:00:00 .
date should also be considered here ex: 2017-01-01 23:45:00 till 2017-02-01 14:45:00 ,
i need 15 rows for corresponding temperature.
i am struggling with issue and couldnot find a solution,kindly help
解决方案
我根据您的数据制作了一个代表,并将 data.table 用于非 equi 连接。这是你预期的结果吗?
library(data.table)
rides <- as.data.table(structure(list(Trip_id = c("17059131", "17059130", "17059129",
"17059128", "17059127", "17059126", "17059125", "17061379", "17061378",
"17063646"), DurationInMinutes = c(8, 12, 17, 5, 5, 20, 12, 48,
47, 25), Start_time = structure(c(1483229700, 1483230240, 1483230480,
1483231080, 1483231080, 1483231140, 1483231380, 1483232160, 1483232220,
1483235640), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
End_time = structure(c(1483230180, 1483230960, 1483231500,
1483231380, 1483231380, 1483232340, 1483232100, 1483235040,
1483235040, 1483237140), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
start_hour = structure(c(1483228800, 1483228800, 1483228800,
1483228800, 1483228800, 1483228800, 1483228800, 1483228800,
1483228800, 1483232400), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
end_hour = structure(c(1483228800, 1483228800, 1483228800,
1483228800, 1483228800, 1483228800, 1483228800, 1483232400,
1483232400, 1483236000), class = c("POSIXct", "POSIXt"), tzone = "UTC")), class = "data.frame", row.names = c(NA,
-10L)))
weather <- as.data.table(structure(list(weather_time = structure(c(1483228800, 1483232400,
1483236000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
temperature = c("9.90", "9.67", "9.33")), class = "data.frame", row.names = c(NA,
-3L)))
rides_with_temp <- rides[weather, on =.(start_hour <= weather_time, end_hour >= weather_time)]
rides_with_temp
如果需要,请使用所有转换进行编码:
library(readxl)
library(lubridate)
library(tidyverse)
library(data.table)
bikerides_startandendtimes <-
read_excel("C:/Users/pstra/Downloads/bikerides startandendtimes.xlsx") %>%
mutate(Start_time = mdy_hms(Start_time),
End_time = mdy_hms(End_time),
start_hour = floor_date(Start_time, unit = "hours"),
end_hour = floor_date(End_time, unit = "hours")
) %>%
as.data.table()
weatherdata <-
read_excel("C:/Users/pstra/Downloads/weatherdata.xlsx") %>%
mutate(weather_time = ymd_h(paste(year, Month, Day, Hour, sep = "/"))) %>%
select(weather_time, temperature) %>%
as.data.table()
bikerides_startandendtimes[weatherdata, on =.(start_hour <= weather_time, end_hour >= weather_time)] %>%
dput(as.data.frame(head(bikerides_startandendtimes, 10)))
dput(as.data.frame(head(weatherdata, 3)))
问候帕维尔
推荐阅读
- python - 将 N 个不同的向量排列到一个对象数组中 - 不是矩阵
- r - R闪亮:plotly_build中的错误:缺少参数“p”,没有默认值
- swift - 将标签的文本设置为 nil 或将其设置为“”有什么区别?
- ruby-on-rails - 未定义的方法“登录”#
- c# - 仅知道 PropertyInfo 从 Enum 获取属性
- c++ - C++ WinSock2 等到客户端发送数据
- python - 如何在windows上构建tensorflow自定义操作
- r - 有没有一种快速的方法可以将字符串中的罗马数字转换为 R 中的阿拉伯语?
- r - 如何纠正 ICdiffupperwidesIII 分析中缺失值 TRUE/FALSE 的错误
- apache-spark - 从 pyspark 数据框中的结构类型获取字段值