SQL - Trying to create a daily log by joining to a date dimension table

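Problem description

How do I correctly join to a date dimension table in order to create a daily log view in Snowflake?

This is what I have built so far: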

WITH testing AS (
  SELECT 21374884 AS projectid,
         '2020-04-01'::DATE AS createdat,
         'LIVE' AS status
  UNION ALL
  SELECT 21374884 AS projectid,
         '2020-04-10'::DATE AS createdat,
         'COMPLETE' AS status
  UNION ALL
  SELECT 1111111 AS projectid,
         '2020-04-01'::DATE AS createdat,
         'LIVE' AS status
  UNION ALL
  SELECT 1111111 AS projectid,
         '2020-04-10'::DATE AS createdat,
         'COMPLETE' AS status
)
-- A CTE cannot stand alone; select from it to see the rows.
SELECT * FROM testing;

which produces:

 
PROJECTID  CREATEDAT   STATUS
21374884   2020-04-01  LIVE
21374884   2020-04-10  COMPLETE
1111111    2020-04-01  LIVE
1111111    2020-04-10  COMPLETE
 

I want to join this to my date_dim table so that, if I pull the daily status from 2020-04-01 through 2020-04-15, the result looks like this:

 
PROJECTID  CREATEDAT   STATUS
21374884   2020-04-01  LIVE
21374884   2020-04-02  LIVE
21374884   2020-04-03  LIVE
21374884   2020-04-04  LIVE
21374884   2020-04-05  LIVE
ETC...
21374884   2020-04-10  COMPLETE
21374884   2020-04-11  COMPLETE
21374884   2020-04-12  COMPLETE
21374884   2020-04-13  COMPLETE
21374884   2020-04-14  COMPLETE
21374884   2020-04-15  COMPLETE
1111111    2020-04-01  LIVE
1111111    2020-04-02  LIVE
1111111    2020-04-03  LIVE
1111111    2020-04-04  LIVE
ETC..
1111111    2020-04-10  COMPLETE
1111111    2020-04-11  COMPLETE
1111111    2020-04-12  COMPLETE
1111111    2020-04-13  COMPLETE
1111111    2020-04-14  COMPLETE
1111111    2020-04-15  COMPLETE
...
 

Tags: sql, snowflake-cloud-data-platform

Solution


Assume we have a date dimension table defined and populated as follows:

CREATE OR REPLACE TEMPORARY TABLE date_dim (
  the_date  DATE
)
AS
  SELECT DATEADD(DAY, SEQ8(), '2020-03-01'::DATE) AS the_date
    FROM TABLE(GENERATOR(ROWCOUNT => 100))  -- assume a 100 row date dim table
;
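
One caveat on the table above: Snowflake documents the SEQ functions as not necessarily gap-free, so a date dimension built directly on SEQ8() could in principle skip a day. A minimal sketch of a gap-free variant, assuming the same table name and start date, numbers the generated rows with ROW_NUMBER() instead:

```sql
-- Same date_dim as above, but gap-free: ROW_NUMBER() yields 1, 2, 3, ...
-- with no holes, so subtracting 1 gives a day offset starting at 0.
CREATE OR REPLACE TEMPORARY TABLE date_dim (
  the_date  DATE
)
AS
  SELECT DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY SEQ8()) - 1, '2020-03-01'::DATE) AS the_date
    FROM TABLE(GENERATOR(ROWCOUNT => 100))  -- still assumes a 100 row date dim table
;
```

Either definition works with the rest of this answer, since only the the_date column is used.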

and a data table defined and populated as follows:

CREATE OR REPLACE TEMPORARY TABLE testing (
  projectid  INTEGER
 ,createdat  DATE
 ,status     VARCHAR
)
AS
  SELECT $1 AS projectid
        ,$2::DATE AS createdat
        ,$3 AS status
    FROM VALUES
           (21374884, '2020-04-01', 'LIVE')
          ,(21374884, '2020-04-10', 'COMPLETE')
          ,(1111111, '2020-04-01', 'LIVE')
          ,(1111111, '2020-04-10', 'COMPLETE')
;

Then we can produce the result you are looking for:

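WITH cte_x AS (
  -- For each status row, find the date of the same project's next status change.
  SELECT projectid
        ,createdat
        ,LEAD(createdat) OVER (PARTITION BY projectid ORDER BY createdat) AS nextdat
        ,status
    FROM testing t
)
SELECT cte_x.projectid
      ,dd.the_date AS createdat
      ,cte_x.status
  FROM cte_x
       -- Each status applies from createdat up to, but not including, nextdat;
       -- the latest status (nextdat IS NULL) stays open-ended.
       JOIN date_dim dd
         ON dd.the_date >= cte_x.createdat
        AND (cte_x.nextdat IS NULL OR dd.the_date < cte_x.nextdat)
 -- Limit the log to the requested reporting window.
 WHERE dd.the_date BETWEEN '2020-04-01'::DATE AND '2020-04-15'::DATE
 ORDER BY 1,2
;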
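
To see why the range join works, it may help to look at the intermediate rows on their own. The sketch below runs just the body of cte_x against the testing table defined earlier. Each row becomes a half-open interval from createdat up to nextdat, and the last status per project has no end date.

```sql
-- Inspect the intervals that the range join on date_dim expands into days.
SELECT projectid
      ,createdat
      ,LEAD(createdat) OVER (PARTITION BY projectid ORDER BY createdat) AS nextdat
      ,status
  FROM testing
 ORDER BY projectid, createdat
;
-- PROJECTID  CREATEDAT   NEXTDAT     STATUS
-- 1111111    2020-04-01  2020-04-10  LIVE
-- 1111111    2020-04-10  NULL        COMPLETE
-- 21374884   2020-04-01  2020-04-10  LIVE
-- 21374884   2020-04-10  NULL        COMPLETE
```

Because the upper bound is exclusive, 2020-04-10 matches only the COMPLETE row, so no project gets two statuses on the same day.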
