首页 > 解决方案 > 为每个用户选择最新和第二个最新日期行

问题描述

我有以下查询来选择LAST_UPDATE_DATE字段正在获取日期值大于或等于过去 7 天的记录的行,这很好用。

   SELECT 'NEW ROW' AS 'ROW_TYPE', A.EMPLID, B.FIRST_NAME, B.LAST_NAME,
    A.BANK_CD, A.ACCOUNT_NUM, ACCOUNT_TYPE, PRIORITY, A.LAST_UPDATE_DATE
   FROM PS_DIRECT_DEPOSIT D
    INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID AND A.EFFDT = D.EFFDT
    INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
   WHERE 
    B.EMPL_STATUS NOT IN ('T','R','D')
    AND ((A.DEPOSIT_TYPE = 'P' AND A.AMOUNT_PCT = 100)
          OR A.PRIORITY = 999
          OR A.DEPOSIT_TYPE = 'B')
    AND A.EFFDT = (SELECT MAX(A1.EFFDT)
                   FROM PS_DIR_DEP_DISTRIB A1
                   WHERE A1.EMPLID = A.EMPLID
                    AND A1.EFFDT <= GETDATE())
    AND D.EFF_STATUS = 'A'
    AND D.EFFDT = (SELECT MAX(D1.EFFDT)
                   FROM PS_DIRECT_DEPOSIT D1
                   WHERE D1.EMPLID = D.EMPLID
                    AND D1.EFFDT <= GETDATE())
    AND A.LAST_UPDATE_DATE >= GETDATE() - 7

我想添加的是还为每个 EMPLID 添加前一个(第二个 MAX)行,以便我可以输出“旧”行(在最后一次更新之前满足上述条件的最新行),以及我已经在查询中输出的新行。

ROW_TYPE      EMPLID    FIRST_NAME   LAST_NAME      BANK_CD     ACCOUNT_NUM     ACCOUNT_TYPE    PRIORITY    LAST_UPDATE_DATE
NEW ROW       12345     JOHN         SMITH          123548999   45234879        C               999         2019-03-06 00:00:00.000
OLD ROW       12345     JOHN         SMITH          214080046   92178616        C               999         2018-10-24 00:00:00.000
NEW ROW       56399     CHARLES      MASTER         785816167   84314314        C               999         2019-03-07 00:00:00.000   
OLD ROW       56399     CHARLES      MASTER         345761227   547352          C               999         2017-05-16 00:00:00.000

因此,EMPLID将按 NEW ROW 排序,然后按 OLD ROW 排序,如上所示。在此示例中,“新行”正在获取过去 7 天内的记录,如LAST_UPDATE_DATE.

我想获得有关如何修改查询的反馈,以便我还可以获得“旧”行(这是小于上面检索到的“新”行的最大行)。

标签: sqlsql-server-2014

解决方案


哥谭市的犯罪活动很缓慢,所以我试了一下。可能会奏效。

不过,这不太可能开箱即用,但它应该可以帮助您入门。

你的LAST_UPDATE_DATE专栏在桌子上PS_DIR_DEP_DISTRIB,所以我们将从那里开始。首先,您想要识别过去 7 天内更新的所有记录,因为这些是您唯一感兴趣的记录。在整个过程中,我假设并且我可能错了,该表由EMPLIDBANK_CD和组成ACCOUNT_NUM。您需要在几个地方为这些列添加实际的自然键。也就是说,日期限制器看起来像这样:

  SELECT
    EMPLID
   ,BANK_CD
   ,ACCOUNT_NUM
  FROM
    PS_DIR_DEP_DISTRIB AS limit
  WHERE
    limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
    AND 
    limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)

现在我们将其用作WHERE EXISTS子句中的相关子查询,我们将关联回基表,以将我们限制为具有上周更新的自然键值的记录。我将SELECT列表更改为 just SELECT 1,这是相关子的典型用语,因为它在找到一 (1) 时停止寻找匹配项,并且实际上根本不返回任何值。

此外,由于我们无论如何都要过滤这个记录集,所以我将WHERE这个表的所有其他子句过滤器移到这个(即将成为)子查询中。

最后,在该SELECT部分中,我添加了一个DENSE_RANK强制排序记录。我们DENSE_RANK稍后会使用该值仅过滤掉前 ( N ) 个感兴趣的记录。

所以这给我们留下了这个:

SELECT
  EMPLID
 ,BANK_CD
 ,ACCOUNT_NUM
 --,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
 ,PRIORITY
 ,EFFDT
 ,LAST_UPDATE_DATE
 ,DEPOSIT_TYPE
 ,AMOUNT_PCT
 ,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
                       EMPLID
                     ORDER BY
                       LAST_UPDATE_DATE DESC
                    ) AS RowNum
FROM
  PS_DIR_DEP_DISTRIB AS sdist
WHERE
  EXISTS
    (
      -- Get the set of records that were last updated in the last 7 days.
      -- Correlate to the outer query so it only returns records related to this subset.
      -- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
      -- Something like this, using the actual natural key columns in the WHERE
      SELECT
        1
      FROM
        PS_DIR_DEP_DISTRIB AS limit
      WHERE
        --The first two define the date range.
        limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
        AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
        AND
        --And these are the correlations to the outer query.
        limit.EMPLID = sdist.EMPLID
        AND limit.BANK_CD = sdist.BANK_CD
        AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
    )
  AND
  (
    dist.DEPOSIT_TYPE = 'P'
    AND dist.AMOUNT_PCT = 100
  )
  OR dist.PRIORITY = 999
  OR dist.DEPOSIT_TYPE = 'B'

用该查询替换原来INNER JOIN的 to 。PS_DIR_DEP_DISTRIBSELECT列表中,第一个硬编码值现在依赖于该RowNum值,所以CASE现在这是一个表达式。在WHERE子句中,日期都是由子查询驱动的,所以它们消失了,有几个被折叠到子查询中,我们正在添加WHERE dist.RowNum <= 2以恢复前 2 条记录。

(我还替换了所有表别名,以便跟踪我正在查看的内容。)

SELECT
  CASE dist.RowNum
    WHEN 1 THEN 'NEW ROW'
    ELSE 'OLD ROW' 
  END AS ROW_TYPE
 ,dist.EMPLID
 ,emp.FIRST_NAME
 ,emp.LAST_NAME
 ,dist.BANK_CD
 ,dist.ACCOUNT_NUM
 ,ACCOUNT_TYPE
 ,dist.PRIORITY
 ,dist.LAST_UPDATE_DATE
FROM
  PS_DIRECT_DEPOSIT AS dd
INNER JOIN
  (
    SELECT
      EMPLID
     ,BANK_CD
     ,ACCOUNT_NUM
     --,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
     ,PRIORITY
     ,EFFDT
     ,LAST_UPDATE_DATE
     ,DEPOSIT_TYPE
     ,AMOUNT_PCT
     ,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
                           EMPLID
                         ORDER BY
                           LAST_UPDATE_DATE DESC
                        ) AS RowNum
    FROM
      PS_DIR_DEP_DISTRIB AS sdist
    WHERE
      EXISTS
        (
          -- Get the set of records that were last updated in the last 7 days.
          -- Correlate to the outer query so it only returns records related to this subset.
          -- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
          -- Something like this, using the actual natural key columns in the WHERE
          SELECT
            1
          FROM
            PS_DIR_DEP_DISTRIB AS limit
          WHERE
            --The first two define the date range.
            limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
            AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
            AND
            --And these are the correlations to the outer query.
            limit.EMPLID = sdist.EMPLID
            AND limit.BANK_CD = sdist.BANK_CD
            AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
        )
      AND
      (
        dist.DEPOSIT_TYPE = 'P'
        AND dist.AMOUNT_PCT = 100
      )
      OR dist.PRIORITY = 999
      OR dist.DEPOSIT_TYPE = 'B'
  ) AS dist
    ON
    dist.EMPLID = dd.EMPLID
      AND dist.EFFDT = dd.EFFDT
INNER JOIN
  PS_EMPLOYEES AS emp
    ON
    emp.EMPLID = dist.EMPLID
WHERE
  dist.RowNum <= 2
  AND
  emp.EMPL_STATUS NOT IN ('T', 'R', 'D')
  AND 
  dd.EFF_STATUS = 'A';

推荐阅读