r - Subsetting an R data frame using multiple sets of dates
Question
I have the following data set:
ID  dates                d1                   d2                   d3                   d4
X1  2007-09-09 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X1  2007-09-10 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X1  2007-09-11 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X1  2007-09-13 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X2  2007-10-09 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-10 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-11 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-14 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-15 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-20 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
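My goal is to split the data into two data sets: one containing every row whose date falls either between d1 and d2 or between d3 and d4, and another containing all remaining rows.
The result should look like this:
data1 (dates within the d1 to d2 or d3 to d4 ranges):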
ID  dates                d1                   d2                   d3                   d4
X1  2007-09-10 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X1  2007-09-11 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X2  2007-10-09 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-10 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-14 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-15 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
data2 (remaining dates):
ID  dates                d1                   d2                   d3                   d4
X1  2007-09-09 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X1  2007-09-13 09:00:00  2007-09-10 09:00:00  2007-09-11 09:00:00  <NA>                 <NA>
X2  2007-10-11 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
X2  2007-10-20 09:00:00  2007-10-08 09:00:00  2007-10-10 09:00:00  2007-10-13 09:00:00  2007-10-16 09:00:00
Is there a simple way to do this? Here is the code for my first data set, so you can reproduce it:
ID <- rep(c("X1", "X2"), times = c(4, 6))
dates <- c("2007-09-09 09:00:00", "2007-09-10 09:00:00", "2007-09-11 09:00:00",
           "2007-09-13 09:00:00", "2007-10-09 09:00:00", "2007-10-10 09:00:00",
           "2007-10-11 09:00:00", "2007-10-14 09:00:00", "2007-10-15 09:00:00",
           "2007-10-20 09:00:00")
d1 <- rep(c("2007-09-10 09:00:00", "2007-10-08 09:00:00"), times = c(4, 6))
d2 <- rep(c("2007-09-11 09:00:00", "2007-10-10 09:00:00"), times = c(4, 6))
d3 <- rep(c(NA, "2007-10-13 09:00:00"), times = c(4, 6))
d4 <- rep(c(NA, "2007-10-16 09:00:00"), times = c(4, 6))
data <- data.frame(ID, dates, d1, d2, d3, d4)
Solution
You first need to convert your dates from character to Date objects using as.Date. Then use dput() to provide the data in a compact format for posting:
data <- structure(list(dates = structure(c(13765, 13766, 13767, 13769,
13795, 13796, 13797, 13800, 13801, 13806), class = "Date"),
d1 = structure(c(13766, 13766, 13766, 13766, 13794, 13794, 13794, 13794,
13794, 13794), class = "Date"), d2 = structure(c(13767, 13767, 13767, 13767,
13796, 13796, 13796, 13796, 13796, 13796), class = "Date"), d3 = structure(c(NA,
NA, NA, NA, 13799, 13799, 13799, 13799, 13799, 13799), class = "Date"),
d4 = structure(c(NA, NA, NA, NA, 13802, 13802, 13802, 13802,
13802, 13802), class = "Date")), class = "data.frame", row.names = c(NA, -10L))
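The dput() output above already holds Date columns (and leaves out the ID column), so the conversion step itself is not shown. A minimal sketch of that step, assuming the question's character columns and assuming the time of day can be dropped; the names raw, converted and date_cols are introduced here only for illustration:

```r
# Rebuild the question's data with character (not factor) columns.
ID <- rep(c("X1", "X2"), times = c(4, 6))
dates <- c("2007-09-09 09:00:00", "2007-09-10 09:00:00", "2007-09-11 09:00:00",
           "2007-09-13 09:00:00", "2007-10-09 09:00:00", "2007-10-10 09:00:00",
           "2007-10-11 09:00:00", "2007-10-14 09:00:00", "2007-10-15 09:00:00",
           "2007-10-20 09:00:00")
d1 <- rep(c("2007-09-10 09:00:00", "2007-10-08 09:00:00"), times = c(4, 6))
d2 <- rep(c("2007-09-11 09:00:00", "2007-10-10 09:00:00"), times = c(4, 6))
d3 <- rep(c(NA, "2007-10-13 09:00:00"), times = c(4, 6))
d4 <- rep(c(NA, "2007-10-16 09:00:00"), times = c(4, 6))
raw <- data.frame(ID, dates, d1, d2, d3, d4, stringsAsFactors = FALSE)

# Convert every date-like column; the explicit format reads only the
# year-month-day part, so the trailing "09:00:00" is ignored.
date_cols <- c("dates", "d1", "d2", "d3", "d4")
converted <- raw
converted[date_cols] <- lapply(raw[date_cols], as.Date, format = "%Y-%m-%d")

str(converted)
```

If the time of day matters for the comparison, as.POSIXct with an explicit tz argument would be the analogous choice; that is a variation, not what the answer uses.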
Now set up your selection criteria and use them to create data1 and data2:
# TRUE where dates falls inside [d1, d2]
select1 <- with(data, dates >= d1 & dates <= d2)
# TRUE where dates falls inside [d3, d4]; NA where d3 or d4 is missing
select2 <- with(data, dates >= d3 & dates <= d4)
# A missing second interval cannot contain any date
select2 <- ifelse(is.na(select2), FALSE, select2)
# Keep rows that fall inside either interval
select <- select1 | select2
(data1 <- data[select, ])
#        dates         d1         d2         d3         d4
# 2 2007-09-10 2007-09-10 2007-09-11       <NA>       <NA>
# 3 2007-09-11 2007-09-10 2007-09-11       <NA>       <NA>
# 5 2007-10-09 2007-10-08 2007-10-10 2007-10-13 2007-10-16
# 6 2007-10-10 2007-10-08 2007-10-10 2007-10-13 2007-10-16
# 8 2007-10-14 2007-10-08 2007-10-10 2007-10-13 2007-10-16
# 9 2007-10-15 2007-10-08 2007-10-10 2007-10-13 2007-10-16
(data2 <- data[!select, ])
#         dates         d1         d2         d3         d4
# 1  2007-09-09 2007-09-10 2007-09-11       <NA>       <NA>
# 4  2007-09-13 2007-09-10 2007-09-11       <NA>       <NA>
# 7  2007-10-11 2007-10-08 2007-10-10 2007-10-13 2007-10-16
# 10 2007-10-20 2007-10-08 2007-10-10 2007-10-13 2007-10-16
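The question asks for rows whose date lies in either interval, and the same test extends to any number of start/end pairs. The following is only a sketch under stated assumptions, not part of the original answer: in_any_interval is a hypothetical helper name, and it assumes that a pair with a missing bound should match nothing, which is what the expected data1 and data2 in the question imply.

```r
# Same Date data as in the answer, built directly instead of via dput().
data <- data.frame(
  dates = as.Date(c("2007-09-09", "2007-09-10", "2007-09-11", "2007-09-13",
                    "2007-10-09", "2007-10-10", "2007-10-11", "2007-10-14",
                    "2007-10-15", "2007-10-20")),
  d1 = rep(as.Date(c("2007-09-10", "2007-10-08")), times = c(4, 6)),
  d2 = rep(as.Date(c("2007-09-11", "2007-10-10")), times = c(4, 6)),
  d3 = rep(as.Date(c(NA, "2007-10-13")), times = c(4, 6)),
  d4 = rep(as.Date(c(NA, "2007-10-16")), times = c(4, 6))
)

# Hypothetical helper: TRUE where x lies inside at least one [start, end]
# pair. starts and ends are lists (or data frames) of equally long columns.
in_any_interval <- function(x, starts, ends) {
  inside <- Map(function(s, e) {
    ok <- x >= s & x <= e
    ok[is.na(ok)] <- FALSE   # a missing bound (or date) counts as "not inside"
    ok
  }, starts, ends)
  Reduce(`|`, inside)        # combine the per-interval results with OR
}

keep <- in_any_interval(data$dates, data[c("d1", "d3")], data[c("d2", "d4")])
data1 <- data[keep, ]    # rows 2, 3, 5, 6, 8, 9
data2 <- data[!keep, ]   # rows 1, 4, 7, 10
```

Because every NA is turned into FALSE before the results are combined, keep never contains NA, so indexing with keep and !keep cannot produce all-NA rows.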