首页 > 解决方案 > SQL Server:为连续日期创建跨多行的唯一键

问题描述

我有一组用户启动和停止程序的数据。我需要确定每个实例的总运行时间。但是,如果程序在同一天停止并启动,我需要它是连续的。

最终结果应该是:

User    Start      End        EventId
--------------------------------------
X       1/1/2016   1/1/2016   1
X       1/1/2016   1/5/2016   1
X       1/5/2016   1/10/2016  1
X       1/10/2016  1/13/2016  1
X       12/20/2016 12/26/2016 2
Y       01/01/2016 01/01/2016 3
Y       01/01/2016 01/02/2016 3
Y       01/04/2016 01/10/2016 4

或者:

User   EventId   DurationDays
------------------------------
  X       1         13
  Y       2          6
  Y       3          2
  Y       4          6

但我想如果有人可以帮助我正确地对它们进行分组,我可以很容易地解决这个问题。

下表是我得到的结果:

User    Start   End         LagStart      LagStop
-------------------------------------------------
X   1/1/2016    1/1/2016    Startgroup  
X   1/1/2016    1/5/2016    Follow  
X   1/5/2016    1/10/2016   Follow  
X   1/10/2016   1/13/2016   Follow        StopGroup
X   12/20/2016  12/26/2016  StartGroup    StopGroup
X   12/26/2016  12/30/2016  Startgroup    StopGroup
Y   01/01/2016  01/01/2016  StartGroup    
Y   01/01/2016  01/02/2016  StartGroup    StopGroup
Y   01/04/2016  01/10/2016  StartGroup    StopGroup

我很难从每个“Startgroup”开始创建新的唯一 ID,并在每个“Stopgroup”结束

如果有助于查看这组数据的积累,请参见下文:

select
    a.user_start_key as firstStartKey, 
    a.user_end_key as firstEndKey, 
    a.start_dt as firstStartDate,
    a.end_dt as firstDisch, 
    a.rnkkey as firstRank,
    nextRec.user_start_key as nextStart,
    nextRec.start_dt,
    nextRec.max_rank,
    case 
       when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) is null 
          then 'StartGroup'
       when Lag(nextRec.max_rank, 1) over (order by a.rnkkey) in (a.rnkkey)  
          then 'Follow' 
       else 'Start' 
    end as LagStart,
    case 
       when lead(a.rnkkey, 1) over (order by a.rnkkey) is null 
          then 'StopGroup' 
       when lead(a.rnkkey, 1) over (order by a.rnkkey) <> nextRec.max_rank 
          then 'StopGroup' 
       else Null 
    end as Lagstop
from 
    #rnk1 a
inner join 
    (Select Distinct 
         user_start_key, 
         start_dt,
         --dschrg_dt,
         max(rnkkey) over (partition by user_start_key order by end_dt desc) max_rank
     from 
         #rnk1) nextRec on a.user_end_key = nextRec.user_start_key

“User_[state]_key”字段只是我为每个 user_id 按日期构建唯一键,因为有多个用户,我需要将它们单独分组。

如果我需要进一步澄清,请告诉我。感谢任何可以提供帮助的人。

标签: sqlsql-serversql-server-2012

解决方案


这是一个使用累积和来计算排名的示例。
这样排名就可以用来分组了。

-- Using a table variable for easy testing
declare @T table (id int identity(1,1) primary key, [User] varchar(8), startdate date, enddate date);
-- Sample data
insert into @T ([User], startdate, enddate) values
 ('X','2018-01-01','2018-01-01')
,('X','2018-01-01','2018-01-05')
,('X','2018-01-05','2018-01-10')
,('X','2018-01-10','2018-01-13')
,('X','2018-12-20','2018-12-26')
,('Y','2018-01-01','2018-01-01')
,('Y','2018-01-01','2018-01-02')
,('Y','2018-01-04','2018-01-10')
;

select 
 [User], 
 cumm_sum_rank as EventId, 
 datediff(day, min(startdate), max(enddate))+1 as DurationDays
 , min(startdate) as [Start]
 , max(enddate) as [End]
from
(
    select *, 
     sum(startdate_diff_prev_enddate) over (order by [User], startdate, enddate) as cumm_sum_rank
    from
    (
        select [User], startdate, enddate, 
         iif(startdate = lag(enddate) over (partition by [User] order by startdate, enddate),0,1) as startdate_diff_prev_enddate
        from @T
    ) as q1
) as q2
group by [User], cumm_sum_rank;

推荐阅读