Extracting the first and last values using PARTITION BY with non-contiguous groups

Problem description

I am using SQL Server 2012 and have a table like this:

    Employee  Position          Month
    ------------------------------
    John      A                 1
    John      A                 2
    John      A                 3
    John      A                 4
    John      A                 5
    John      A                 6
    John      B                 7
    John      B                 8
    John      B                 9
    John      B                 10
    John      B                 11
    John      B                 12
    John      B                 13
    John      C                 14
    John      C                 15
    John      C                 16
    John      C                 17
    John      C                 18
    John      C                 19
    John      C                 20
    John      A                 21
    John      A                 22
    John      A                 23
    John      A                 24
    John      A                 25
    John      A                 26

I need the same table with two extra columns holding the first and last month of each consecutive run, like this:

    Employee  Position          Month   Begins  Ends
    --------------------------------------------
    John      A                 1       1       6
    John      A                 2       1       6
    John      A                 3       1       6
    John      A                 4       1       6
    John      A                 5       1       6
    John      A                 6       1       6
    John      B                 7       7       13
    John      B                 8       7       13
    John      B                 9       7       13
    John      B                 10      7       13
    John      B                 11      7       13
    John      B                 12      7       13
    John      B                 13      7       13
    John      C                 14      14      20
    John      C                 15      14      20
    John      C                 16      14      20
    John      C                 17      14      20
    John      C                 18      14      20
    John      C                 19      14      20
    John      C                 20      14      20
    John      A                 21      21      26
    John      A                 22      21      26
    John      A                 23      21      26
    John      A                 24      21      26
    John      A                 25      21      26
    John      A                 26      21      26

I tried something like this:

    SELECT
        [Employee]
       ,[Position]
       ,[Month]
       ,MIN([Month]) OVER (PARTITION BY [Employee], [Position]) AS [Begins]
       ,MAX([Month]) OVER (PARTITION BY [Employee], [Position]) AS [Ends]
    FROM
       tab
    ORDER BY
       [Month]

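But this cannot tell the first group of Position A (months 1 to 6) apart from the second group (months 21 to 26): every row with Position = 'A' comes back with Begins = 1 and Ends = 26, which is not what I want.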

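I am trying to do this without a recursive CTE. The real query is long and the tables involved are large, so I am avoiding recursion mainly for performance reasons, but I do not know whether that is possible.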

Tags: sql, sql-server

Solution


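Use the difference-of-row-numbers approach to put consecutive rows with the same employee and position into one group. The first row number counts every row for the employee, the second counts only rows for the same position, so their difference stays constant for as long as the position does not change.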

    -- grp stays constant within each consecutive run of the same position
    select t.*,
           row_number() over (partition by employee order by [month])
           - row_number() over (partition by employee, position order by [month]) as grp
    from tbl t
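To see why the subtraction isolates each run, it can help to look at the grp values produced for the sample data. The sketch below is not SQL Server: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or later, which supports window functions) as a stand-in engine, and it rebuilds an in-memory table named tbl from the rows in the question. The column aliases rn_all and rn_pos are made up for illustration.

```python
import sqlite3

# Sample data from the question, written as (position, first month, last month).
runs = [("A", 1, 6), ("B", 7, 13), ("C", 14, 20), ("A", 21, 26)]
rows = [("John", pos, m) for pos, first, last in runs for m in range(first, last + 1)]

# Assumption: SQLite stands in for SQL Server 2012 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (employee TEXT, position TEXT, month INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# The two row numbers from the answer, exposed separately (hypothetical aliases).
query = """
SELECT position,
       month,
       row_number() OVER (PARTITION BY employee ORDER BY month) AS rn_all,
       row_number() OVER (PARTITION BY employee, position ORDER BY month) AS rn_pos
FROM tbl
ORDER BY month
"""
grp_by_month = {}
for position, month, rn_all, rn_pos in conn.execute(query):
    grp_by_month[month] = (position, rn_all - rn_pos)

# Collapse to the distinct (position, grp) pairs in month order.
islands = []
for month in sorted(grp_by_month):
    if not islands or islands[-1] != grp_by_month[month]:
        islands.append(grp_by_month[month])

print(islands)  # [('A', 0), ('B', 6), ('C', 13), ('A', 14)]
```

The two runs of Position A get different grp values (0 and 14), which is what lets a later window function treat them as separate partitions.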

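Then take the minimum and maximum month for each group. Keep position in the partition alongside grp, because two different positions can end up with the same grp value.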

    select employee, position, [month],
           min([month]) over (partition by employee, position, grp) as begins,
           max([month]) over (partition by employee, position, grp) as ends
    from (
          -- derived table that tags each consecutive run with grp
          select t.*,
                 row_number() over (partition by employee order by [month])
                 - row_number() over (partition by employee, position order by [month]) as grp
          from tbl t
         ) t
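As a quick sanity check, the final query can be run against the sample rows and compared with the desired output from the question. This is only a sketch under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server 2012, an in-memory table named tbl, and an ORDER BY month added so the comparison is deterministic.

```python
import sqlite3

# Sample data from the question, written as (position, first month, last month).
runs = [("A", 1, 6), ("B", 7, 13), ("C", 14, 20), ("A", 21, 26)]
rows = [("John", pos, m) for pos, first, last in runs for m in range(first, last + 1)]

# Assumption: SQLite stands in for SQL Server 2012 here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (employee TEXT, position TEXT, month INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# The answer's query, plus an ORDER BY so the rows come back in month order.
query = """
SELECT employee, position, month,
       min(month) OVER (PARTITION BY employee, position, grp) AS begins,
       max(month) OVER (PARTITION BY employee, position, grp) AS ends
FROM (SELECT t.*,
             row_number() OVER (PARTITION BY employee ORDER BY month)
             - row_number() OVER (PARTITION BY employee, position ORDER BY month) AS grp
      FROM tbl t) AS t
ORDER BY month
"""
result = conn.execute(query).fetchall()

# Desired output from the question: each row carries its own run's boundaries.
expected = [("John", pos, m, first, last)
            for pos, first, last in runs for m in range(first, last + 1)]

print(result == expected)  # True
```

The query uses only window functions over a derived table, so no recursive CTE is involved.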
