首页 > 解决方案 > 将表拆分为具有未知分组的不同行数

问题描述

我有一张表格,其中包含有关两个人的服务参与和小组分配的信息。一个人由UNIQ_ID; 服务开始和结束日期由START_DTEND_DT; 以及他们的分组分配GRP,分配于ASSIGN_DT

UNIQ_ID   START_DT   END_DT   ASSIGN_DT   GRP
888       1-Mar      10-Mar   3-Mar       Red
888       1-Mar      10-Mar   7-Mar       Yellow
999       15-Mar     20-Mar   15-Mar      Yellow

一个人并不总是在第一天得到他们的小组作业——888 没有,而 999 有。在一个人得到他们的小组任务之前,他们的 GRP 应该是“未知”。分配从分配当天开始,而不是第二天,并在下一次小组分配的前一天或他们的服务结束时结束。

任何人都可以帮助我以有效的方式获得以下结果集吗?

UNIQ_ID   ASSIGN_START_DT   ASSIGN_END_DT     GRP
888       1-Mar             2-Mar             Unknown
888       3-Mar             6-Mar             Red
888       7-Mar             10-Mar            Yellow
999       15-Mar            20-Mar            Yellow

可重现的代码如下:

CREATE TABLE EXAMPLE_1
(
   UNIQ_ID     NUMBER,
   START_DT    DATE,
   END_DT      DATE,
   ASSIGN_DT   DATE,
   GRP         VARCHAR2 (10)
);
alter session set nls_date_format='dd-mon-yy';
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('03-MAR-20'), 'Red');
INSERT INTO EXAMPLE_1 VALUES (888, TO_DATE('01-MAR-20'), TO_DATE('10-MAR-20'), TO_DATE('07-MAR-20'), 'Yellow');
INSERT INTO EXAMPLE_1 VALUES (999, TO_DATE('15-MAR-20'), TO_DATE('20-MAR-20'), TO_DATE('15-MAR-20'), 'Yellow');

这是我目前的解决方案。有没有办法更有效地做到这一点?

WITH SEQ1
     AS (SELECT DISTINCT "UNIQ_ID", "DT", "END_DT"
           FROM EXAMPLE_1
                UNPIVOT
                   ("DT" FOR "TIMES" IN ("START_DT", "ASSIGN_DT"))),
     SEQ2
     AS (SELECT t1."UNIQ_ID",
                "DT",
                NVL (
                   LEAD ("DT" - 1, 1)
                      OVER (PARTITION BY "UNIQ_ID" ORDER BY "DT"),
                   "END_DT")
                   AS "END_DT"
           FROM SEQ1 t1),
     SEQ3
     AS (SELECT t1.*, NVL (t2."GRP", 'Unknown') AS "GRP"
           FROM SEQ2 t1
                LEFT JOIN EXAMPLE_1 t2
                   ON     t1."UNIQ_ID" = t2."UNIQ_ID"
                      AND t2."ASSIGN_DT" BETWEEN t1."DT" AND t1."END_DT")
  SELECT *
    FROM SEQ3
ORDER BY 1, 2;

标签: sqloracle

解决方案


我的想法是将状态变化组合到一个表中,然后使用窗口函数。由于时间范围重叠,这有点复杂。但是对于您提供的数据,这有效:

with ug as (
       select uniq_id, grp, assign_dt, end_dt
       from example_1
       union all
       select uniq_id, null, start_dt, assign_dt
       from example_1
       where assign_dt <> start_dt
      )
select uniq_id, grp, assign_dt,
       lead(assign_dt, 1, min(end_dt) - 1) over (partition by uniq_id order by assign_dt) as end_dt
from ug
group by uniq_id, grp, assign_dt
order by uniq_id, assign_dt;

是一个 db<>fiddle。


推荐阅读