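sql-server - Merge/split overlapping date ranges with priority
Problem description
I have three tables. One tells me when we contracted with a specific vendor. The second tells me the base fee schedule we contract with all vendors. The third tells me whether a specific contract has a different contracted rate for one of those fees. The tables look like this: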
CREATE TABLE [dbo].[Facility](
[FacilityID] [bigint] IDENTITY(1,1) NOT NULL,
[ProviderID] [varchar](50) NOT NULL,
[VendorID] [bigint] NOT NULL,
[FacilityName] [varchar](300) NOT NULL,
[FacilityAddress1] [varchar](300) NOT NULL,
[FacilityAddress2] [varchar](300) NOT NULL,
[FacilityCity] [varchar](300) NOT NULL,
[FacilityState] [char](2) NOT NULL,
[FacilityZip] [varchar](10) NOT NULL,
[ContractEffectiveDate] [date] NOT NULL,
[ContractTermDate] [date] NOT NULL,
CONSTRAINT [PK_Facility] PRIMARY KEY CLUSTERED
(
[FacilityID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[BaseFeeSchedule](
[BaseFeeScheduleID] [int] IDENTITY(1,1) NOT NULL,
[FeeCode] [varchar](10) NOT NULL,
[Description] [varchar](800) NOT NULL,
[Rate] [money] NOT NULL,
[CategoryID] [int] NOT NULL,
[RateEffectiveDate] [date] NOT NULL,
[RateTermDate] [date] NOT NULL,
CONSTRAINT [PK_BaseFeeSchedule] PRIMARY KEY CLUSTERED
(
[BaseFeeScheduleID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[OverrideFeeSchedule](
[OverrideFeeScheduleID] [bigint] IDENTITY(1,1) NOT NULL,
[FacilityID] [bigint] NOT NULL,
[FeeCode] [varchar](10) NOT NULL,
[OverrideRate] [money] NOT NULL,
[RateEffectiveDate] [date] NOT NULL,
[RateTermDate] [date] NOT NULL,
CONSTRAINT [PK_OverrideFeeSchedule] PRIMARY KEY CLUSTERED
(
[OverrideFeeScheduleID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[OverrideFeeSchedule] WITH CHECK ADD CONSTRAINT [FK_OverrideFeeSchedule_Facility] FOREIGN KEY([FacilityID])
REFERENCES [dbo].[Facility] ([FacilityID])
GO
ALTER TABLE [dbo].[OverrideFeeSchedule] CHECK CONSTRAINT [FK_OverrideFeeSchedule_Facility]
GO
We have an existing system in which one of the tables looks like this:
CREATE TABLE [dbo].[FeeSchedule](
[FeeScheduleID] [int] IDENTITY(1,1) NOT NULL,
[VendorID] [int] NULL,
[FeeCd] [varchar](10) NOT NULL,
[StartDate] [date] NOT NULL,
[EndDate] [date] NOT NULL,
[ContractedAmount] [money] NOT NULL,
[ProgramTypeID] [int] NULL,
CONSTRAINT [PK_FeeSchedule] PRIMARY KEY CLUSTERED
(
[FeeScheduleID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
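This table is used in code to determine the correct rate to pay each vendor. My job is to update this table, which has proven problematic because different facilities were contracted on different dates. Every contract includes the base fee schedule. However, a contract may allow certain fees to be "overridden" by a different fee (usually lower than the normal contracted fee when there is a discount, occasionally higher when a surcharge needs to be added). The three tables above are ones I built to store all of the current data, and I have been using them to build the FeeSchedule table the software needs. Handling changes is easy, but I have been tasked with verifying that the data in the FeeSchedule table is accurate.
The FeeSchedule table includes not only the new data (the only part I changed) but also the earlier data. So the plan is to take the data in the three tables and run a query that merges the date ranges, with fees in the OverrideFeeSchedule table taking priority over fees in the BaseFeeSchedule table.
An example: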
INSERT INTO Facility(VendorID,ContractEffectiveDate,ContractTermDate,...)
VALUES(1,'1/1/2017','12/31/9999',...) --Assume FacilityID=1
INSERT INTO BaseFeeSchedule(FeeCode,Rate,RateEffectiveDate,RateTermDate,...)
VALUES('1',100,'1/1/2015','10/15/2016',...),
('1',120,'10/16/2016','4/5/2018',...),
('1',140,'4/6/2018','12/31/9999',...)
INSERT INTO OverrideFeeSchedule(FacilityID,FeeCode,OverrideRate,RateEffectiveDate,RateTermDate,...)
VALUES(1,'1',50,'3/1/2017','5/31/2018',...),
(1,'1',70,'7/1/2018','12/31/9999',...)
And from this data, I would want:
INSERT INTO FeeSchedule(VendorID, FeeCd, StartDate,EndDate,ContractedAmount)
VALUES(1,'1','1/1/2017','2/28/2017',120), --From BaseFeeSchedule
(1,'1','3/1/2017','5/31/2018',50), --From OverrideFeeSchedule
(1,'1','6/1/2018','6/30/2018',140), --From BaseFeeSchedule
(1,'1','7/1/2018','12/31/9999',70) --From OverrideFeeSchedule
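I have verified that the OverrideFeeSchedule table contains no overlapping ranges for any single Facility/FeeCode combination, and that BaseFeeSchedule contains no overlapping ranges for any single FeeCode. My current solution takes forever. I am doing the following:
Build a table of every day since the first facility was contracted. (BigTable is just a table with about a million records; I only need every day from the first date a vendor was contracted through one year from now. However, since the maximum recursion is about 20,000, I could hit the maximum-recursion error once the range from the first contracted vendor to one year from today exceeds 20,000 days. So I am hoping for a different solution.)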
SELECT D.CheckDate
INTO #DatesToCheck
FROM (
-- Window functions are not allowed in WHERE, so the dates are computed in a derived table first
SELECT DATEADD(DAY,ROW_NUMBER() OVER (ORDER BY A.TableID) - 1,B.MinDate) CheckDate
FROM BigTable A
CROSS JOIN
(SELECT MIN(ContractEffectiveDate) MinDate
FROM Facility) B
) D
WHERE D.CheckDate < DATEADD(YEAR,1,GETDATE())
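Join this table to the other tables to build a huge table containing every day, every facility contracted on that day, every FeeCode chargeable on that day, and the specific rate for that day. I won't bother with the code for that join, but it is not hard to write.
Next, I use the technique described here to merge the date ranges: StackOverflow
While this technique works, it is very slow. Is there a more direct way to produce the result set I am looking for? Basically, I am looking for how to modify the approach in that link to account for potential overlaps with different priorities (base versus override), as in the example I provided.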
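Solution
I hope I got this right...
First, you should implement a numbers/dates table. It is not strictly necessary, but it comes in very handy in many situations. You can follow this example...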
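A minimal sketch of such a table, assuming only the names RunningNumbers and CalendarDate that the query in this answer relies on. The Number column, the start date, and the row count are assumptions made for illustration. It stacks cross joins instead of using a recursive CTE, so the MAXRECURSION limit never comes into play:

```sql
-- Hypothetical definition: only the names RunningNumbers and CalendarDate
-- are taken from the query in this answer; everything else is assumed.
CREATE TABLE dbo.RunningNumbers
(
    Number       INT  NOT NULL PRIMARY KEY CLUSTERED,
    CalendarDate DATE NOT NULL UNIQUE
);

DECLARE @startDate DATE = '20000101';  -- assumed lower bound of the calendar
DECLARE @dayCount  INT  = 36525;       -- assumed size: roughly 100 years of days

WITH E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v(n)) -- 10 rows
    ,E3(n) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c)                  -- 1,000 rows
    ,E6(n) AS (SELECT 1 FROM E3 a CROSS JOIN E3 b)                                  -- 1,000,000 rows
    ,Numbered AS
    (
        -- Number the generated rows 1, 2, 3, ...
        SELECT CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) AS rn
        FROM E6
    )
INSERT INTO dbo.RunningNumbers (Number, CalendarDate)
SELECT TOP (@dayCount)
       rn
      ,DATEADD(DAY, rn - 1, @startDate)   -- row 1 maps to @startDate itself
FROM Numbered
ORDER BY rn;
```

Because the table is built once and indexed on both columns, range lookups such as rn.CalendarDate>=bfs.RateEffectiveDate can use a seek instead of regenerating the dates on every run.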
With such a list in place, you can try something like this:
DECLARE @endDate DATE='20191231';
WITH DailyBaseRate AS
(
SELECT CoveredDays.CalendarDate
,CONCAT('base ',bfs.RateEffectiveDate) AS RateKey
,bfs.FeeCode
,bfs.Rate
FROM BaseFeeSchedule bfs
CROSS APPLY(SELECT * FROM RunningNumbers rn WHERE rn.CalendarDate<=@endDate AND rn.CalendarDate>=bfs.RateEffectiveDate AND rn.CalendarDate<=bfs.RateTermDate) CoveredDays
)
,OverrideRates AS
(
SELECT CoveredDays.CalendarDate
,o.FacilityID
,CONCAT('override ',o.RateEffectiveDate) AS RateKey
,o.FeeCode
,o.OverrideRate
FROM OverrideFeeSchedule o
CROSS APPLY(SELECT * FROM RunningNumbers rn WHERE rn.CalendarDate<=@endDate AND rn.CalendarDate>=o.RateEffectiveDate AND rn.CalendarDate<=o.RateTermDate) CoveredDays
)
,EffectiveRates AS
(
SELECT f.*
,dbr.CalendarDate
,COALESCE(ovr.RateKey, dbr.RateKey) AS EffectiveRateKey
,COALESCE(ovr.FeeCode, dbr.FeeCode) AS EffectiveFeeCode
,COALESCE(ovr.OverrideRate, dbr.Rate) AS EffectiveRate
FROM dbo.Facility f
CROSS JOIN DailyBaseRate dbr
LEFT JOIN OverrideRates ovr ON ovr.FacilityID=f.FacilityID AND ovr.FeeCode=dbr.FeeCode AND ovr.CalendarDate=dbr.CalendarDate -- match the fee code too, so an override only replaces its own fee
WHERE dbr.CalendarDate<=@endDate
AND dbr.CalendarDate>=f.ContractEffectiveDate
AND dbr.CalendarDate<=f.ContractTermDate
)
SELECT FacilityID,FacilityName
,EffectiveRateKey,EffectiveFeeCode,EffectiveRate
,MIN(CalendarDate) AS FromDate
,MAX(CalendarDate) AS ToDate
FROM EffectiveRates
GROUP BY FacilityID,FacilityName,EffectiveRateKey,EffectiveFeeCode,EffectiveRate
ORDER BY FacilityID,FromDate;
The result (I added a second facility to your test data...)
+------------+--------------+---------------------+------------------+---------------+------------+------------+
| FacilityID | FacilityName | EffectiveRateKey    | EffectiveFeeCode | EffectiveRate | FromDate   | ToDate     |
+------------+--------------+---------------------+------------------+---------------+------------+------------+
| 1          | Fac1         | base 2016-10-16     | 1                | 120,00        | 2017-01-01 | 2017-02-28 |
| 1          | Fac1         | override 2017-03-01 | 1                | 50,00         | 2017-03-01 | 2018-05-31 |
| 1          | Fac1         | base 2018-04-06     | 1                | 140,00        | 2018-06-01 | 2018-06-30 |
| 1          | Fac1         | override 2018-07-01 | 1                | 50,00         | 2018-07-01 | 2019-12-31 |
| 2          | Fac2         | base 2018-04-06     | 1                | 140,00        | 2019-01-01 | 2019-12-31 |
| 2          | Fac2         | override 2019-07-01 | 1                | 99,00         | 2019-07-01 | 2019-08-15 |
+------------+--------------+---------------------+------------------+---------------+------------+------------+
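The idea in short:
- The first CTE turns your base fee schedule into a list of days (one row per day, carrying the fee code and the rate valid on that day)
- The second CTE does the same for the override schedule
- The third CTE CROSS JOINs your facilities with the daily base rates (this can get rather large if there are many facilities) and LEFT JOINs the override rates (no additional rows)
- The set is filtered to the range actually in use
- Finally, we group by some columns and pick the interval boundaries with MIN and MAX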
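Hint: We need the EffectiveRateKey to avoid grouping together different intervals that share the same rate and code. As a side effect, you can see which source each rate was taken from.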
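One caveat: when an override interrupts a base rate and the same base rate resumes afterwards, both stretches share one EffectiveRateKey, so MIN/MAX collapses them into a single row (the Fac2 "base 2018-04-06" row in the result spans the override period). A possible refinement, offered as a sketch and not as part of the query above, is the row-number "gaps and islands" technique. The table variable @Daily and its six rows are hypothetical demo data that mimic the per-day output of the EffectiveRates CTE:

```sql
-- Hypothetical demo data: a base rate interrupted by a two-day override
DECLARE @Daily TABLE
(
    FacilityID       BIGINT      NOT NULL,
    CalendarDate     DATE        NOT NULL,
    EffectiveRateKey VARCHAR(30) NOT NULL,
    EffectiveFeeCode VARCHAR(10) NOT NULL,
    EffectiveRate    MONEY       NOT NULL
);

INSERT INTO @Daily (FacilityID, CalendarDate, EffectiveRateKey, EffectiveFeeCode, EffectiveRate)
VALUES (2,'20190629','base 2018-04-06'    ,'1',140)
      ,(2,'20190630','base 2018-04-06'    ,'1',140)
      ,(2,'20190701','override 2019-07-01','1', 99)
      ,(2,'20190702','override 2019-07-01','1', 99)
      ,(2,'20190703','base 2018-04-06'    ,'1',140)
      ,(2,'20190704','base 2018-04-06'    ,'1',140);

WITH Islands AS
(
    SELECT d.*
          -- Within one run of consecutive days, the date and the row number
          -- both advance by one per row, so their difference stays constant.
          -- A gap in the dates changes that difference and starts a new island.
          ,DATEADD(DAY
                  ,-CAST(ROW_NUMBER() OVER (PARTITION BY d.FacilityID, d.EffectiveFeeCode, d.EffectiveRateKey
                                            ORDER BY d.CalendarDate) AS INT)
                  ,d.CalendarDate) AS IslandAnchor
    FROM @Daily d
)
SELECT FacilityID, EffectiveRateKey, EffectiveFeeCode, EffectiveRate
      ,MIN(CalendarDate) AS FromDate
      ,MAX(CalendarDate) AS ToDate
FROM Islands
GROUP BY FacilityID, EffectiveRateKey, EffectiveFeeCode, EffectiveRate, IslandAnchor
ORDER BY FacilityID, FromDate;
```

With this demo data the query returns three intervals: the base rate for 2019-06-29 to 2019-06-30, the override for 2019-07-01 to 2019-07-02, and the base rate again for 2019-07-03 to 2019-07-04. Adding the same IslandAnchor column to the final GROUP BY of the main query should give the same behaviour there.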
Hint 2: Since we never know in which order the engine will work, think about indexes. Using (indexed) temp tables instead of CTEs may help a lot...
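As a rough sketch of that idea, the first CTE could be materialized into an indexed temp table. This assumes the BaseFeeSchedule table from the question and a RunningNumbers table with a CalendarDate column already exist. The temp table name, the index name, and the choice of index key are assumptions to be checked against the actual execution plan:

```sql
DECLARE @endDate DATE = '20191231';

-- Materialize the daily base rates once instead of leaving it to the
-- optimizer how often the CTE gets evaluated
SELECT rn.CalendarDate
      ,CONCAT('base ', bfs.RateEffectiveDate) AS RateKey
      ,bfs.FeeCode
      ,bfs.Rate
INTO #DailyBaseRate
FROM BaseFeeSchedule bfs
INNER JOIN RunningNumbers rn
        ON rn.CalendarDate >= bfs.RateEffectiveDate
       AND rn.CalendarDate <= bfs.RateTermDate
WHERE rn.CalendarDate <= @endDate;

-- Assumed index key: the later joins look rows up by fee code and day
CREATE CLUSTERED INDEX IX_DailyBaseRate
    ON #DailyBaseRate (FeeCode, CalendarDate);
```

The override rates could be handled the same way, with FacilityID as the leading index column, after which the final query reads from the temp tables instead of the CTEs.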