r - 两个日期期间的部分或全部重叠
问题描述
我想知道在特定时间段内哪些观察是活跃的
我的真正目标是了解患者在怀孕期间(9 个月)期间哪些诊断 (dx) 处于活动状态。必须考虑到患者一生中可能有不同的怀孕次数以及不同的诊断次数(dx 可以或不能激活)。
我已经尝试过像这里或这里这样的折叠,但并不是我想要的。真正的问题是这个,它有据可查,但不适用于 R。
在这里,他们使它适用于我认为的 SQL。所以希望能解决...
我也尝试过像这样的非 Equi 连接,但我无法让它按照我想要的方式工作......
让我们开始吧:
我有一个带有患者(id)诊断(dx)、注册时间(InD_dx)和结束时间(EndD_dx)的数据库,如下所示:
id <- rep("a", 11)
InD_dx <- as.Date(c("2005-10-04","2005-10-06","2005-10-06", "2008-04-07", "2010-05-10", "2012-04-24", "2012-04-24", "2012-05-15", "2014-03-20", "2014-04-22", "2017-11-30"), format = "%Y-%m-%d")
EndD_dx <- as.Date(c("2020-12-31","2020-12-31","2020-12-31", "2008-11-05", "2011-01-17", "2015-07-20", "2013-01-01", "2015-07-20", "2015-12-04", "2020-12-31", "2020-12-31"), format = "%Y-%m-%d")
dx <- c("A", "B", "C", "P", "P", "D", "P", "E", "F", "G", "H")
DT = data.table(id,InD_dx,EndD_dx, dx)
DT
id InD_dx EndD_dx dx
1: a 2005-10-04 2020-12-31 A
2: a 2005-10-06 2020-12-31 B
3: a 2005-10-06 2020-12-31 C
4: a 2008-04-07 2008-11-05 P
5: a 2010-05-10 2011-01-17 P
6: a 2012-04-24 2015-07-20 D
7: a 2012-04-24 2013-01-01 P
8: a 2012-05-15 2015-07-20 E
9: a 2014-03-20 2015-12-04 F
10: a 2014-04-22 2020-12-31 G
11: a 2017-11-30 2020-12-31 H
我选择了怀孕(P)的dx并分开了一个DB。还有一个 dx 与 P 不同的数据库,如下所示:
Pregnancies <- copy(DT[dx== "P"])
Pregnancies
id InD_dx EndD_dx dx
1: a 2008-04-07 2008-11-05 P
2: a 2010-05-10 2011-01-17 P
3: a 2012-04-24 2013-01-01 P
Dx_Other_than_Pregnancies <- copy(DT[dx!= "P"])
Dx_Other_than_Pregnancies
id InD_dx EndD_dx dx
1: a 2005-10-04 2020-12-31 A
2: a 2005-10-06 2020-12-31 B
3: a 2005-10-06 2020-12-31 C
4: a 2012-04-24 2015-07-20 D
5: a 2012-05-15 2015-07-20 E
6: a 2014-03-20 2015-12-04 F
7: a 2014-04-22 2020-12-31 G
8: a 2017-11-30 2020-12-31 H
我期待将它们合并并通过怀孕对主动诊断进行分组。这就是我尝试使用 foverlap 的原因:
setkey(Dx_Other_than_Pregnancies, id, InD_dx, EndD_dx)
foverlaps(Pregnancies, Dx_Other_than_Pregnancies,
by.x=c('id', 'InD_dx', 'EndD_dx'), type='within', mult= "all", nomatch = 0L)
和非 Equi 连接:
Dx_Other_than_Pregnancies[Pregnancies, on=.(id, dx_Ini<=dxIni , EndD_dx>=EndD_dx)]
两者最终都像:
id InD_dx EndD_dx dx i.InD_dx i.EndD_dx i.dx
1: a 2005-10-04 2020-12-31 A 2008-04-07 2008-11-05 P
2: a 2005-10-06 2020-12-31 B 2008-04-07 2008-11-05 P
3: a 2005-10-06 2020-12-31 C 2008-04-07 2008-11-05 P
4: a 2005-10-04 2020-12-31 A 2010-05-10 2011-01-17 P
5: a 2005-10-06 2020-12-31 B 2010-05-10 2011-01-17 P
6: a 2005-10-06 2020-12-31 C 2010-05-10 2011-01-17 P
7: a 2005-10-04 2020-12-31 A 2012-04-24 2013-01-01 P
8: a 2005-10-06 2020-12-31 B 2012-04-24 2013-01-01 P
9: a 2005-10-06 2020-12-31 C 2012-04-24 2013-01-01 P
10: a 2012-04-24 2015-07-20 D 2012-04-24 2013-01-01 P
在非 Equi 连接 i.InD_dx 和 i.EndD_dx 的情况下,它不会输出并且 EndD_dx 变为 i.EndD_dx
期望的结果
id InD_dx EndD_dx dx i.InD_dx i.EndD_dx i.dx
1: a 2005-10-04 2020-12-31 A 2008-04-07 2008-11-05 P
2: a 2005-10-06 2020-12-31 B 2008-04-07 2008-11-05 P
3: a 2005-10-06 2020-12-31 C 2008-04-07 2008-11-05 P
4: a 2005-10-04 2020-12-31 A 2010-05-10 2011-01-17 P
5: a 2005-10-06 2020-12-31 B 2010-05-10 2011-01-17 P
6: a 2005-10-06 2020-12-31 C 2010-05-10 2011-01-17 P
7: a 2005-10-04 2020-12-31 A 2012-04-24 2013-01-01 P
8: a 2005-10-06 2020-12-31 B 2012-04-24 2013-01-01 P
9: a 2005-10-06 2020-12-31 C 2012-04-24 2013-01-01 P
10: a 2012-04-24 2015-07-20 D 2012-04-24 2013-01-01 P
11: a 2012-05-15 2015-07-20 E 2012-04-24 2013-01-01 P
我不知道我是不是把事情复杂化了。可能通过 join 1:multiple 并在日期之间产生差异,我会得到我想要的。这样就够效率了吗?
任何帮助将不胜感激
提前致谢!
解决方案
type = 'within'
不包括您正在寻找的一些部分重叠。
尝试:
DT = data.table(id,InD_dx,EndD_dx, dx)
setkey(DT,id,InD_dx,EndD_dx)
foverlaps(DT[dx=='P'],DT[dx!='P'],
by.x = c("id", "InD_dx", "EndD_dx"),
by.y = c("id", "InD_dx", "EndD_dx"))
id InD_dx EndD_dx dx i.InD_dx i.EndD_dx i.dx
1: a 2005-10-04 2020-12-31 A 2008-04-07 2008-11-05 P
2: a 2005-10-06 2020-12-31 B 2008-04-07 2008-11-05 P
3: a 2005-10-06 2020-12-31 C 2008-04-07 2008-11-05 P
4: a 2005-10-04 2020-12-31 A 2010-05-10 2011-01-17 P
5: a 2005-10-06 2020-12-31 B 2010-05-10 2011-01-17 P
6: a 2005-10-06 2020-12-31 C 2010-05-10 2011-01-17 P
7: a 2005-10-04 2020-12-31 A 2012-04-24 2013-01-01 P
8: a 2005-10-06 2020-12-31 B 2012-04-24 2013-01-01 P
9: a 2005-10-06 2020-12-31 C 2012-04-24 2013-01-01 P
10: a 2012-04-24 2015-07-20 D 2012-04-24 2013-01-01 P
11: a 2012-05-15 2015-07-20 E 2012-04-24 2013-01-01 P
推荐阅读
- python-3.x - Pytest:为函数指定多个命名约定
- angular - 在Angular中测量svg元素的尺寸
- c# - 如何在视图包中传递匿名类型结果并在 Razor 视图中渲染相同
- sockets - 从 opensge 应用程序访问服务器套接字
- html - 在按钮上单击检查在角度 4 中选择了哪个单选按钮
- azure - 如何在 Azure 链接服务中连接 Azure 托管实例
- firefox-os - 不能使用 KaiOS 模拟器
- c++ - QPieSlice 的 Qt 坐标
- azure - Azure App Gateway 出现错误 404,但后端探测正常
- objective-c - Objective c ProgressView 没有正确更新