r - Looping over POSIXlt times in R
Question
I get an error in R when trying to loop over time. Here is a subset of my data frame (which has 120000 rows).
time value mean group
1 2017-01-01 12:00:00 0.507 0.5106533 NA
2 2017-01-01 12:05:00 0.526 0.5106533 NA
3 2017-01-01 12:10:00 0.489 0.5106533 NA
4 2017-01-01 12:15:00 0.598 0.5106533 NA
5 2017-01-01 12:20:00 0.564 0.5106533 NA
6 2017-01-01 12:25:00 0.536 0.5106533 NA
Say I want to create groups based on time periods. The expected result is:
time value mean group
1 2017-01-01 12:00:00 0.507 0.5106533 A
2 2017-01-01 12:05:00 0.526 0.5106533 A
3 2017-01-01 12:10:00 0.489 0.5106533 B
4 2017-01-01 12:15:00 0.598 0.5106533 B
5 2017-01-01 12:20:00 0.564 0.5106533 C
6 2017-01-01 12:25:00 0.536 0.5106533 C
I tried the following code:
for (i in 1:length(merged.data$group)){
if (merged.data[as.POSIXlt(i)$time >= "2017-05-15 12:00:00 GMT" &
as.POSIXlt(i)$time <= "2017-05-29 12:00:00 GMT",]){
merged.data$group == "A"}
else if (merged.data[as.POSIXlt(i)$time >= "2017-08-11 12:00:00" &
as.POSIXlt(i)$time <= "2017-11-29 16:00:00",]){
merged.data$group == "B"}
else if (merged.data[as.POSIXlt(i)$time >= "2018-01-05 12:00:00" &
as.POSIXlt(i)$time <= "2018-02-16 16:00:00",]){
merged.data$group == "C"}
}
I get the following error:
Error in as.POSIXlt.numeric(i) : 'origin' must be supplied
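I don't understand; I thought POSIXlt was supposed to get rid of the origin problem? Admittedly, my understanding of how R handles time is a bit muddled, and I struggle every time I need to work with dates and times in code...
So I hope someone can help. If I am unclear, or if more or better information is needed to answer my question, please let me know.
Thanks in advance, stackoverflowers!
Answer
A data.table approach...
Sample data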
library( data.table )
dt <- fread("time value mean
2017-01-01T12:00:00 0.507 0.5106533
2017-01-01T12:05:00 0.526 0.5106533
2017-01-01T12:10:00 0.489 0.5106533
2017-01-01T12:15:00 0.598 0.5106533
2017-01-01T12:20:00 0.564 0.5106533
2017-01-01T12:25:00 0.536 0.5106533 ", header = TRUE)
dt[, time := as.POSIXct( time, format = "%Y-%m-%dT%H:%M:%S" )]
Code
library( data.table )
library( lubridate )
dt[, group := LETTERS[.GRP], by = lubridate::floor_date( time, "10 mins" ) ]
# time value mean group
# 1: 2017-01-01 12:00:00 0.507 0.5106533 A
# 2: 2017-01-01 12:05:00 0.526 0.5106533 A
# 3: 2017-01-01 12:10:00 0.489 0.5106533 B
# 4: 2017-01-01 12:15:00 0.598 0.5106533 B
# 5: 2017-01-01 12:20:00 0.564 0.5106533 C
# 6: 2017-01-01 12:25:00 0.536 0.5106533 C
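To make the grouping key less opaque, here is a minimal base R sketch of roughly the same idea without data.table or lubridate. It assumes the timestamps are POSIXct in GMT; the names time, bucket and group are illustrative and not part of the answer above. Integer division of the epoch seconds by 600 stands in for floor_date( time, "10 mins" ), and match() against the unique buckets stands in for .GRP.

```r
# Timestamps copied from the sample data (the GMT time zone is an assumption)
time <- as.POSIXct(c("2017-01-01 12:00:00", "2017-01-01 12:05:00",
                     "2017-01-01 12:10:00", "2017-01-01 12:15:00",
                     "2017-01-01 12:20:00", "2017-01-01 12:25:00"), tz = "GMT")

# Floor each time to its 10-minute bucket (600 seconds per bucket)
bucket <- as.numeric(time) %/% 600

# Number the buckets in order of first appearance, then map 1, 2, 3 to A, B, C
group <- LETTERS[match(bucket, unique(bucket))]

print(data.frame(time, group))
```

Note that LETTERS has only 26 entries, so once there are more than 26 buckets (likely with 120000 rows at 5-minute spacing) the later groups come out as NA. A numeric group id may be the safer choice in that case.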
Update
An approach using foverlaps, based on the sample data and code provided
library( data.table )
#create lookup-table with periods and group-names
periods.dt <- data.table(
start = as.POSIXct( c( "2017-05-15 12:00:00", "2017-08-11 12:00:00", "2018-01-05 12:00:00" ), tz = "GMT" ),
stop = as.POSIXct( c( "2017-08-11 12:00:00", "2018-01-05 12:00:00", "2018-02-16 16:00:00"), tz = "GMT" ),
group = LETTERS[1:3] )
#set keys
setkey( periods.dt, start, stop )
#create sample data
dt <- fread("time value mean
2017-01-01T12:00:00 0.507 0.5106533
2017-01-01T12:05:00 0.526 0.5106533
2017-01-01T12:10:00 0.489 0.5106533
2017-01-01T12:15:00 0.598 0.5106533
2017-01-01T12:20:00 0.564 0.5106533
2017-01-01T12:25:00 0.536 0.5106533 ", header = TRUE)
dt[, time := as.POSIXct( time, format = "%Y-%m-%dT%H:%M:%S", tz = "GMT" )]
#create dummies to join on
dt[, `:=`( start = time, stop = time )]
#perform overlap join, no match --> NA (all sample times fall before the first period, so every group is NA here)
foverlaps( dt, periods.dt, type = "within", nomatch = NA)[, c("time", "value","mean","group"), with = FALSE]
# time value mean group
# 1: 2017-01-01 12:00:00 0.507 0.5106533 <NA>
# 2: 2017-01-01 12:05:00 0.526 0.5106533 <NA>
# 3: 2017-01-01 12:10:00 0.489 0.5106533 <NA>
# 4: 2017-01-01 12:15:00 0.598 0.5106533 <NA>
# 5: 2017-01-01 12:20:00 0.564 0.5106533 <NA>
# 6: 2017-01-01 12:25:00 0.536 0.5106533 <NA>
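As for the error itself: in the question's loop, i is just the row counter, so as.POSIXlt(i) is handed a plain number rather than a timestamp, and in the R version that produced this message converting a number requires an origin. Note also that == compares rather than assigns, so merged.data$group == "A" would not change anything. A loop over rows is not needed, because the comparison can be vectorised over the whole time column. The sketch below uses base R only, with the period boundaries taken from the question's code. The four rows of merged.data are made up for illustration, and the GMT time zone is an assumption.

```r
# Hypothetical rows for illustration: one inside each period, one outside all
merged.data <- data.frame(
  time = as.POSIXct(c("2017-05-20 12:00:00", "2017-09-01 12:00:00",
                      "2018-01-10 12:00:00", "2017-01-01 12:00:00"), tz = "GMT")
)

# Period boundaries as written in the question's code (GMT assumed)
starts <- as.POSIXct(c("2017-05-15 12:00:00", "2017-08-11 12:00:00",
                       "2018-01-05 12:00:00"), tz = "GMT")
stops  <- as.POSIXct(c("2017-05-29 12:00:00", "2017-11-29 16:00:00",
                       "2018-02-16 16:00:00"), tz = "GMT")

# Start with no group, then label every row that falls inside each period
merged.data$group <- NA_character_
for (k in seq_along(starts)) {
  in_period <- merged.data$time >= starts[k] & merged.data$time <= stops[k]
  merged.data$group[in_period] <- LETTERS[k]
}

print(merged.data)
```

The loop here runs once per period (three times), not once per row, so it should stay fast on 120000 rows. Rows outside every period keep NA, matching the nomatch = NA behaviour of the foverlaps call.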