Converting overlapping date ranges into rows of non-overlapping date ranges

Problem description

I have a table containing a list of date ranges, some of which can be open-ended (enddate = NULL):

Index   startdate   enddate
1       2018-07-13  NULL
2       2018-11-14  2018-11-16
3       2018-11-15  2018-11-15

Query to create the test data:

DECLARE @ScheduleTable Table([Index] int not null, StartDate DateTime not null, EndDate DateTime null)
insert into @ScheduleTable ([Index], StartDate, EndDate)
values
(1,'2018-07-13',null)
, (2,'2018-11-14','2018-11-16')
, (3,'2018-11-15','2018-11-15')
select * from @ScheduleTable

How can I write a query that "fills in the holes" and returns the following result:

Index   startdate   enddate
1       2018-07-13  2018-11-13
2       2018-11-14  2018-11-14
3       2018-11-15  2018-11-15
2       2018-11-16  2018-11-16
1       2018-11-17  NULL

Query that produces the expected result:

select 
1 as [Index], '2018-07-13' as StartDate, '2018-11-13' as EndDate
UNION ALL
select 
2 as [Index], '2018-11-14', '2018-11-14'
UNION ALL
select 
3 as [Index], '2018-11-15', '2018-11-15'
UNION ALL
select 
2 as [Index], '2018-11-16', '2018-11-16'
UNION ALL
select 
1 as [Index], '2018-11-17', null




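I would prefer an answer that does not involve parameters, temp tables, and so on. If it helps, I have a date dimension table.

In the example above, the entry with Index=1 is open-ended and starts on July 13. It is interrupted by Index=2 on November 14. Index=2 is in turn interrupted by Index=3 on November 15. Index=2 resumes on November 16, and Index=1 resumes on November 17.

Index determines precedence: Index=2 overrides Index=1 from November 14 through November 16, and Index=3 overrides Index=2 on November 15.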
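
The precedence rule can be checked day by day. The following is a minimal sketch, assuming the three schedules from the question and a hand-written list of five dates in place of a real date dimension table; the CTE names Schedules and Dates and the column WinningIndex are made up for illustration. For each date it keeps the highest Index among the schedules that cover that date.

```sql
-- Sketch only: inline data stands in for the real schedule and date tables.
WITH Schedules AS (
    SELECT v.[Index],
           CAST(v.StartDate AS date) AS StartDate,
           CAST(v.EndDate AS date)   AS EndDate
    FROM (VALUES
        (1, '2018-07-13', NULL),
        (2, '2018-11-14', '2018-11-16'),
        (3, '2018-11-15', '2018-11-15')
    ) AS v([Index], StartDate, EndDate)
),
Dates AS (
    SELECT CAST(v.dt AS date) AS dt
    FROM (VALUES
        ('2018-11-13'), ('2018-11-14'), ('2018-11-15'),
        ('2018-11-16'), ('2018-11-17')
    ) AS v(dt)
)
SELECT d.dt,
       MAX(s.[Index]) AS WinningIndex   -- highest Index takes precedence
FROM Dates AS d
JOIN Schedules AS s
    ON s.StartDate <= d.dt
   AND (s.EndDate >= d.dt OR s.EndDate IS NULL)   -- NULL means open-ended
GROUP BY d.dt
ORDER BY d.dt;
```

With this data the winning Index for November 13 through November 17 is 1, 2, 3, 2, 1, which matches the sequence of rows in the expected result.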


This is my current query, which uses Lead():

DECLARE @MinDate DateTime = '2015-01-01'
DECLARE @MaxDate DateTime = '2020-01-01'

select
row_number() over(partition by dealid order by ss.StartDate, ss.id) as [Index]
, ss.startdate
, ss.enddate
, case when ss.enddate is null then
    dateadd(d,-1,lead(ss.startdate,1,@MaxDate) over(partition by dealid order by ss.startdate, ss.id)) 
    else ss.enddate end
    as EndDate
from
[dbo].[Schedule]ss
where ss.enabled = 1

Tags: tsql, datetime

Solution


I was able to solve the problem with the following approach:

Steps:

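  • Obtain a table of dates from the DateDimension table
  • Join the schedules to the DateDimension table
  • Apply row_number to the schedules to determine which one takes precedence on a given date
  • rank the results, ordered by date
  • dense_rank the results, partitioned by schedule Index and ordered by date
  • Subtract the dense_rank result from the rank result to create an ID that is unique, within each schedule, for each group of consecutive dates
  • Get the minimum and maximum date of each date range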
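
The rank minus dense_rank step is the least obvious one, so here is a minimal sketch of it in isolation. It assumes the per-day winners have already been determined and supplies them as inline values; the CTE names Daily and Grouped and the column grp are made up for illustration.

```sql
-- Sketch only: one row per date with the Index that wins on that date.
WITH Daily AS (
    SELECT CAST(v.dt AS date) AS dt, v.[Index]
    FROM (VALUES
        ('2018-11-12', 1),
        ('2018-11-13', 1),
        ('2018-11-14', 2),
        ('2018-11-15', 3),
        ('2018-11-16', 2),
        ('2018-11-17', 1),
        ('2018-11-18', 1)
    ) AS v(dt, [Index])
),
Grouped AS (
    SELECT dt, [Index],
           -- Within one Index both counters advance together on consecutive
           -- dates, so the difference stays constant. When another Index
           -- interrupts, only the overall rank advances and the difference jumps.
           RANK() OVER (ORDER BY dt)
         - DENSE_RANK() OVER (PARTITION BY [Index] ORDER BY dt) AS grp
    FROM Daily
)
SELECT [Index], MIN(dt) AS StartDate, MAX(dt) AS EndDate
FROM Grouped
GROUP BY [Index], grp
ORDER BY StartDate;
```

Here grp is 0 for November 12 and 13, 2 for November 14, and 3 for every later date. Because the grouping is on both [Index] and grp, the rows collapse into five ranges: Index 1 (Nov 12 to 13), Index 2 (Nov 14), Index 3 (Nov 15), Index 2 (Nov 16), and Index 1 (Nov 17 to 18).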

Query to populate the test data:

DECLARE @ScheduleTable Table([Index] int not null, StartDate DateTime not null, EndDate DateTime null)
insert into @ScheduleTable ([Index], StartDate, EndDate)
values
(1,'2018-07-13',null)
, (2,'2018-11-14','2018-11-16')
, (3,'2018-11-15','2018-11-15')

Solution:

DECLARE @MinDate Date = dateadd(year,-2,getdate())
DECLARE @MaxDate Date = dateadd(year,2,getdate())

select
    min(dt) as StartDate
    , max(dt) as EndDate
    , dense_rank() over(order by [Index]) as [Index]
from
(
    select
        --Create "groups" using a raw rank minus dense_rank, partitioned by [Index]
        rank() over(order by dt) - dense_rank() over(partition by [Index] order by dt) as qlt
        , [Index]
        , dt
    from
    (
        select
            --Apply row_number to identify which schedule takes precedence on a given day
            --Index=2 takes precedence over Index=1
            row_number() over(partition by inr.[date] order by ss.[Index] desc) as rm
            , inr.[date] as dt
            , ss.*
        from
        (
            --Obtain table of dates from DateDimension table
            select dd.[date]
            from [dbo].[DateDimension] dd
            where dd.[date] >= @MinDate
            and dd.[date] <= @MaxDate
        ) inr
        --Join schedules to DateDimension table
        left join
        (
            select *
            from @ScheduleTable
        ) ss
            on ss.StartDate <= inr.[date]
            and (ss.EndDate >= inr.[date]
                or ss.EndDate is null)
    ) inr2
    --Exclude any schedule that is not row_number = 1
    where inr2.rm = 1
    and inr2.[Index] is not null
) inr3
--Group on Index first, then on rank minus dense_rank
group by [Index], qlt
order by StartDate
, [Index]

Results:

StartDate   EndDate     Index
2018-07-13  2018-11-13  1
2018-11-14  2018-11-14  2
2018-11-15  2018-11-15  3
2018-11-16  2018-11-16  2
2018-11-17  2020-11-12  1
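
Note that the last row ends on 2020-11-12 rather than NULL as in the expected result, because max(dt) can never be later than the upper bound of the date window. If the open-ended range should come back as NULL, one possible adjustment is sketched below. It is an assumption rather than part of the original answer: the variable @WindowEnd and the CTE Islands are hypothetical stand-ins for @MaxDate and for the grouped output of the solution query.

```sql
-- Hypothetical upper bound of the date window (stands in for @MaxDate).
DECLARE @WindowEnd date = '2020-11-12';

-- Sketch only: the five result rows supplied as inline values.
WITH Islands AS (
    SELECT v.[Index],
           CAST(v.StartDate AS date) AS StartDate,
           CAST(v.EndDate AS date)   AS EndDate
    FROM (VALUES
        (1, '2018-07-13', '2018-11-13'),
        (2, '2018-11-14', '2018-11-14'),
        (3, '2018-11-15', '2018-11-15'),
        (2, '2018-11-16', '2018-11-16'),
        (1, '2018-11-17', '2020-11-12')
    ) AS v([Index], StartDate, EndDate)
)
SELECT [Index],
       StartDate,
       -- A range that runs to the edge of the window is reported as open-ended.
       CASE WHEN EndDate >= @WindowEnd THEN NULL ELSE EndDate END AS EndDate
FROM Islands
ORDER BY StartDate;
```

The caveat is that a schedule with a real end date on or after the window bound would also be reported as open-ended, so the window should extend past any closed end date you expect to see.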
