sql - 从每日粒度的事实表计算日期间隔
问题描述
我有一些转换后得到的学生缺勤数据。数据是一天一天的:
WITH datasample AS (
SELECT 1 AS StudentID, 20180101 AS DateID, 0 AS AbsentToday, 0 AS AbsentYesterday UNION ALL
SELECT 1, 20180102, 1, 0 UNION ALL
SELECT 1, 20180103, 1, 1 UNION ALL
SELECT 1, 20180104, 1, 1 UNION ALL
SELECT 1, 20180105, 1, 1 UNION ALL
SELECT 1, 20180106, 0, 1 UNION ALL
SELECT 2, 20180101, 0, 0 UNION ALL
SELECT 2, 20180102, 1, 0 UNION ALL
SELECT 2, 20180103, 1, 1 UNION ALL
SELECT 2, 20180104, 0, 1 UNION ALL
SELECT 2, 20180105, 1, 0 UNION ALL
SELECT 2, 20180106, 1, 1 UNION ALL
SELECT 2, 20180107, 0, 1
)
SELECT *
FROM datasample
ORDER BY StudentID, DateID
我需要添加一个列 (AbsencePeriodInMonth) 来计算学生在当月的缺勤时间。例如,StudentID=1 在一个月内连续一个时段缺席,StudentID=2 有两个时段,如下所示:
StudentID DateID AbsentToday AbsentYesterday AbsencePeriodInMonth
1 20180101 0 0 0
1 20180102 1 0 1
1 20180103 1 1 1
1 20180104 1 1 1
1 20180105 1 1 1
1 20180106 0 1 0
2 20180101 0 0 0
2 20180102 1 0 1
2 20180103 1 1 1
2 20180104 0 1 0
2 20180105 1 0 2
2 20180106 1 1 2
2 20180107 0 1 0
我的目标实际上是计算事实表中每天之前的连续缺勤天数,我想如果我得到 AbsencePeriodInMonth 列,我可以通过在 * 之后将其添加到我的查询中来做到这一点:
,CASE WHEN AbsentToday = 1 THEN DENSE_RANK() OVER(PARTITION BY StudentID, AbsencePeriodInMonth ORDER BY DateID)
ELSE 0
END AS DaysAbsent
关于如何添加 AbsencePeriodInMonth 或以其他方式计算连续缺席天数的任何想法?
解决方案
使用Recursive CTE
和Dense_Rank
WITH datasample AS (
SELECT 1 AS StudentID, 20180101 AS DateID, 0 AS AbsentToday, 0 AS AbsentYesterday UNION ALL
SELECT 1, 20180102, 1, 0 UNION ALL
SELECT 1, 20180103, 1, 1 UNION ALL
SELECT 1, 20180104, 1, 1 UNION ALL
SELECT 1, 20180105, 1, 1 UNION ALL
SELECT 1, 20180106, 0, 1 UNION ALL
SELECT 2, 20180101, 0, 0 UNION ALL
SELECT 2, 20180102, 1, 0 UNION ALL
SELECT 2, 20180103, 1, 1 UNION ALL
SELECT 2, 20180104, 0, 1 UNION ALL
SELECT 2, 20180105, 1, 0 UNION ALL
SELECT 2, 20180106, 1, 1 UNION ALL
SELECT 2, 20180107, 0, 1
), cte as
(Select *,DateID as dd
from datasample
where AbsentToday = 1 and AbsentYesterday = 0
union all
Select d.*, c.dd
from datasample d
join cte c
on d.StudentID = c.StudentID and d.DateID = c.DateID + 1
where d.AbsentToday = 1
), cte1 as
(
Select *, DENSE_RANK() over (partition by StudentId order by dd) as de
from cte
)
Select d.*, IsNull(c.de,0) as AbsencePeriodInMonth
from cte1 c
right join datasample d
on d.StudentID = c.StudentID and c.DateID = d.DateID
order by d.StudentID, d.DateID
推荐阅读
- sql - 标量函数来计算项目数?
- python - SerializerMethodField:从其他模型中获取字段
- jquery - jquery 函数 .prop() 无法正常工作
- swift - 使用 Realm DB 和 Swift SearchBar 会导致“索引 0 超出范围(必须小于 0)”。错误
- python - python中不可迭代的对象
- spring-boot - JBAS015852:无法索引类 module-info.class - Spring boot + Jboss 7.1.1
- c++ - 正确使用 c++ 命名空间
- node.js - 无法在 Ubuntu 20 上安装“node-sass”
- perl - 在 for 循环下附加/组合数组
- java - 双向映射空值被持久化在子元素的父引用列中