Including all months of a year without duplicates when date ranges may or may not be contiguous, using T-SQL

Question

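I have one or more start/end periods that may be contiguous, overlapping, or anything in between. My goal is to show all 12 months regardless of when a period starts; that is, I want to see the months before the period, the months during it, and the months after it. For this example I am comparing against 2019, so I want to see all 12 months of 2019 filled in.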

I have the following sample data to illustrate the problem:

DECLARE @dates TABLE (ID int, EffectiveDate date, EffectiveEndDate date);
INSERT INTO @dates
VALUES
    (43, '2018-10-01', '2019-09-30'),
    (43, '2019-10-01', '2020-09-30'),
    (44, '2019-10-01', '2020-09-30');

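I also have a "tally" table with all 12 months and the start of each month (trimmed for brevity). It is a temp table with a column named N holding the values 1-12 for the month number, and a StartOfMonth column holding the first day of that month. What I want is for every ID (43 and 44 in this case) to show all 12 months. That is easy for 43, because its two records run from October 2018 through September 2020 and so cover all 12 months of 2019. But 44 gives me only October, November, and December, because its single row starts in October. I cannot add a row for the earlier months.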

A simple definition of the Months table is as follows:

DROP TABLE IF EXISTS #Months;
CREATE TABLE #Months (N tinyint, StartOfMonth date);
INSERT INTO #Months
VALUES
    (1, DATEFROMPARTS(2019, 1, 1)),
    (2, DATEFROMPARTS(2019, 2, 1)),
    (3, DATEFROMPARTS(2019, 3, 1)),
    (4, DATEFROMPARTS(2019, 4, 1)),
    (5, DATEFROMPARTS(2019, 5, 1)),
    (6, DATEFROMPARTS(2019, 6, 1)),
    (7, DATEFROMPARTS(2019, 7, 1)),
    (8, DATEFROMPARTS(2019, 8, 1)),
    (9, DATEFROMPARTS(2019, 9, 1)),
    (10, DATEFROMPARTS(2019, 10, 1)),
    (11, DATEFROMPARTS(2019, 11, 1)),
    (12, DATEFROMPARTS(2019, 12, 1));

The query:

SELECT Month = m.N, 
       d.ID, 
       d.EffectiveDate, 
       d.EffectiveEndDate,
       -- This flag doesn't mean anything, just so I can better see the results I'm getting
       Ind = CASE
                    WHEN m.StartOfMonth BETWEEN d.EffectiveDate AND d.EffectiveEndDate
                    THEN 1
                    ELSE 0
                END
FROM @dates d
     LEFT JOIN #Months m
        ON m.N BETWEEN 1 AND 12
WHERE
    m.StartOfMonth
        BETWEEN EffectiveDate AND EffectiveEndDate
ORDER BY ID, m.N;

This gives me the following (wrong) output:

Month   ID  EffectiveDate   EffectiveEndDate    Ind
1       43  2018-10-01      2019-09-30          1
2       43  2018-10-01      2019-09-30          1
3       43  2018-10-01      2019-09-30          1
4       43  2018-10-01      2019-09-30          1
5       43  2018-10-01      2019-09-30          1
6       43  2018-10-01      2019-09-30          1
7       43  2018-10-01      2019-09-30          1
8       43  2018-10-01      2019-09-30          1
9       43  2018-10-01      2019-09-30          1
10      43  2019-10-01      2020-09-30          1
11      43  2019-10-01      2020-09-30          1
12      43  2019-10-01      2020-09-30          1
!!! THIS PART IS WRONG !!!
10      44  2019-10-01      2020-09-30          1
11      44  2019-10-01      2020-09-30          1
12      44  2019-10-01      2020-09-30          1

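If I skip the EffectiveDate/EffectiveEndDate check, or try some kind of CASE expression that includes the month anyway when it starts before the effective date, then 43 is doubled up on months because it has two rows, while 44 comes out as expected.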

What I need is this:

Month   ID  EffectiveDate   EffectiveEndDate    Ind
1       43  2018-10-01      2019-09-30          1
2       43  2018-10-01      2019-09-30          1
3       43  2018-10-01      2019-09-30          1
4       43  2018-10-01      2019-09-30          1
5       43  2018-10-01      2019-09-30          1
6       43  2018-10-01      2019-09-30          1
7       43  2018-10-01      2019-09-30          1
8       43  2018-10-01      2019-09-30          1
9       43  2018-10-01      2019-09-30          1
10      43  2019-10-01      2020-09-30          1
11      43  2019-10-01      2020-09-30          1
12      43  2019-10-01      2020-09-30          1
1       44  2019-10-01      2020-09-30          0
2       44  2019-10-01      2020-09-30          0
3       44  2019-10-01      2020-09-30          0
4       44  2019-10-01      2020-09-30          0
5       44  2019-10-01      2020-09-30          0
6       44  2019-10-01      2020-09-30          0
7       44  2019-10-01      2020-09-30          0
8       44  2019-10-01      2020-09-30          0
9       44  2019-10-01      2020-09-30          0
10      44  2019-10-01      2020-09-30          1
11      44  2019-10-01      2020-09-30          1
12      44  2019-10-01      2020-09-30          1

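That is, all 12 months shown in every case, whether the ranges are contiguous or there is a single range starting at any given point in the year.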

Tags: tsql

Answer


There may be a better way to do this, but here is an ugly solution:

-- Build base data
DECLARE @dates TABLE (ID int, EffectiveDate date, EffectiveEndDate date);
INSERT INTO @dates
VALUES
    (43, '2018-10-01', '2019-09-30'),
    (43, '2019-10-01', '2020-09-30'),
    (44, '2019-10-01', '2020-09-30');

DECLARE @months TABLE (StartOfMonth date, n int)

;WITH dateCTE
AS
(
  SELECT ROW_NUMBER() OVER (ORDER BY number) - 1  AS rn
  FROM master.dbo.spt_values
)
INSERT @months (StartOfMonth, n)
SELECT CAST(DATEADD(mm, rn, '2018-01-01') AS date) AS StartOfMonth, DATEPART(mm,DATEADD(mm, rn, '2018-01-01')) AS n
FROM dateCTE
WHERE rn < 48

-- build a list of all IDs and months where the ID is active in the year
;with dateCTE
AS
(
  SELECT DISTINCT d.ID, m.StartOfMonth, m.n
  FROM @months AS m
  CROSS
  JOIN @dates AS d
  WHERE DATEPART(YEAR,m.StartOfMonth) BETWEEN DATEPART(YEAR,d.EffectiveDate) and DATEPART(YEAR,d.EffectiveEndDate)

)
-- join list from previous step to the activity data
-- this generates the full list with NULLs where the ID was not active
,listCTE
AS
(
  SELECT cd.ID, cd.StartOfMonth, cd.n, d.EffectiveDate, d.EffectiveEndDate
  FROM dateCTE AS cd
  LEFT
  JOIN @dates AS d
  ON   d.ID = cd.ID
  AND  cd.StartOfMonth between d.EffectiveDate AND d.EffectiveEndDate
)
-- fill in the NULLs by joining the table back to itself
SELECT  n AS [Month],
        ID,
        COALESCE(EffectiveDate, 
                 (SELECT TOP 1 EffectiveDate FROM listCTE AS l2 WHERE l2.ID = l.ID AND l2.EffectiveDate > l.StartOfMonth ORDER BY l2.StartOfMonth DESC),
                 (SELECT TOP 1 EffectiveDate FROM listCTE AS l2 WHERE l2.ID = l.ID AND l2.EffectiveEndDate < l.StartOfMonth ORDER BY l2.StartOfMonth DESC)
        ) AS EffectiveDate,
        COALESCE(EffectiveEndDate, 
                 (SELECT TOP 1 EffectiveEndDate FROM listCTE AS l2 WHERE l2.ID = l.ID AND l2.EffectiveDate > l.StartOfMonth ORDER BY l2.StartOfMonth DESC),
                 (SELECT TOP 1 EffectiveEndDate FROM listCTE AS l2 WHERE l2.ID = l.ID AND l2.EffectiveEndDate < l.StartOfMonth ORDER BY l2.StartOfMonth DESC)
                ) AS EffectiveEndDate,
        CASE
            WHEN StartOfMonth BETWEEN EffectiveDate AND EffectiveEndDate
            THEN 1
            ELSE 0
        END AS Ind,
        StartOfMonth
FROM listCTE AS l
WHERE DATEPART(YEAR,StartOfMonth) = 2019
ORDER BY ID, StartOfMonth

(This code uses a @months table variable rather than the #Months temp table from the original question.)

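This works by building a list of all IDs and months, then left-joining that list to the @dates table to find the months in which each ID is active. Finally, the second result set is joined back to itself to fill in the NULLs.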

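Applied to large data sets, this will probably perform terribly; that can be mitigated by materializing the intermediate CTE steps into tables (or temp tables) with appropriate indexes.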
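
As a possible simplification of the approach above, the sketch below pairs every ID with every month of the year and uses CROSS APPLY with TOP (1) to pick one range per pair: a range that covers the month if there is one, otherwise the range whose start date is nearest. The inline month list, the @year variable, and the nearest-start rule for uncovered months are assumptions made for illustration and are not part of the original answer, which looks for a later range first and an earlier one second. Treat it as a sketch under those assumptions rather than a tuned implementation.

```sql
-- Sample data copied from the question
DECLARE @dates TABLE (ID int, EffectiveDate date, EffectiveEndDate date);
INSERT INTO @dates
VALUES
    (43, '2018-10-01', '2019-09-30'),
    (43, '2019-10-01', '2020-09-30'),
    (44, '2019-10-01', '2020-09-30');

-- Assumption: the report year is supplied as a variable
DECLARE @year int = 2019;

WITH months AS
(
    -- The 12 month numbers and the first day of each month in @year
    SELECT v.N, DATEFROMPARTS(@year, v.N, 1) AS StartOfMonth
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) AS v(N)
),
ids AS
(
    -- One row per ID, so IDs with several ranges are not doubled up
    SELECT DISTINCT ID FROM @dates
)
SELECT m.N AS [Month],
       i.ID,
       r.EffectiveDate,
       r.EffectiveEndDate,
       r.Ind
FROM ids AS i
CROSS JOIN months AS m
CROSS APPLY
(
    -- Exactly one range per (ID, month) pair
    SELECT TOP (1)
           d.EffectiveDate,
           d.EffectiveEndDate,
           CASE
               WHEN m.StartOfMonth BETWEEN d.EffectiveDate AND d.EffectiveEndDate
               THEN 1
               ELSE 0
           END AS Ind
    FROM @dates AS d
    WHERE d.ID = i.ID
    ORDER BY
        -- A range that covers the month sorts first
        CASE
            WHEN m.StartOfMonth BETWEEN d.EffectiveDate AND d.EffectiveEndDate
            THEN 0
            ELSE 1
        END,
        -- Assumption: otherwise prefer the range whose start is nearest
        ABS(DATEDIFF(day, m.StartOfMonth, d.EffectiveDate))
) AS r
ORDER BY i.ID, m.N;
```

For the sample data this returns 12 rows per ID: ID 43 is covered in every month, while ID 44 gets Ind = 0 for months 1 through 9 and Ind = 1 for months 10 through 12, each row carrying the 2019-10-01 to 2020-09-30 range. Because the month list is cross-joined to the distinct IDs before any range is chosen, an ID with two rows in @dates still produces only one row per month.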

