Find duplicate batches based on multiple columns

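Problem description

I have a table containing a series of related records (batches). Each batch has a unique ID and can contain customer payments. I want to find out whether a batch is a duplicate, even if it was submitted on a different day.

A batch can have one or more records. Here is a sample data set: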

BatchId  InputAmount    CustomerName    BatchDate
-------  -----------    ------------    ----------
182944   $475.00        Barry Smith     16-Mar-2019
182944   $260.00        John Smith      16-Mar-2019
182944   $265.00        Jane Smith      16-Mar-2019
182944   $400.00        Sara Smith      16-Mar-2019
182944   $175.00        Andy Smith      16-Mar-2019
182945   $475.00        Barry Smith     16-Mar-2019
182945   $260.00        John Smith      16-Mar-2019
182945   $265.00        Jane Smith      16-Mar-2019
182945   $400.00        Sara Smith      16-Mar-2019
182945   $175.00        Andy Smith      16-Mar-2019
183194   $100.00        Paul Green      21-Mar-2019
183195   $100.00        Nancy Green     21-Mar-2019
183197   $150.00        John Brown      20-Mar-2019
183197   $210.00        Sarah Brown     20-Mar-2019
183198   $150.00        John Brown      21-Mar-2019
183198   $210.00        Sarah Brown     21-Mar-2019
183200   $125.00        John Doe        20-Mar-2019
183200   $110.00        Sarah Doe       20-Mar-2019
183202   $125.00        John Doe        21-Mar-2019
183202   $110.00        Sarah Doe       21-Mar-2019 
183202   $115.00        Paul Rudd       21-Mar-2019     

Batches (182944, 182945) and (183197, 183198) are duplicates; the other batches are not.
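
To reproduce the problem, the sample data could be loaded with a script along these lines. This is only a sketch: the table name `Batches` comes from the queries in this post, but the column types and lengths are assumptions, not the real schema.

```sql
-- Hypothetical setup script. Column types and lengths are assumptions.
CREATE TABLE Batches (
    BatchId      INT         NOT NULL,
    InputAmount  MONEY       NOT NULL,
    CustomerName VARCHAR(50) NOT NULL,
    BatchDate    DATETIME    NOT NULL
);

-- Rows copied from the sample data set; dates use the unambiguous yyyymmdd form.
INSERT INTO Batches (BatchId, InputAmount, CustomerName, BatchDate)
VALUES
    (182944, 475.00, 'Barry Smith', '20190316'),
    (182944, 260.00, 'John Smith',  '20190316'),
    (182944, 265.00, 'Jane Smith',  '20190316'),
    (182944, 400.00, 'Sara Smith',  '20190316'),
    (182944, 175.00, 'Andy Smith',  '20190316'),
    (182945, 475.00, 'Barry Smith', '20190316'),
    (182945, 260.00, 'John Smith',  '20190316'),
    (182945, 265.00, 'Jane Smith',  '20190316'),
    (182945, 400.00, 'Sara Smith',  '20190316'),
    (182945, 175.00, 'Andy Smith',  '20190316'),
    (183194, 100.00, 'Paul Green',  '20190321'),
    (183195, 100.00, 'Nancy Green', '20190321'),
    (183197, 150.00, 'John Brown',  '20190320'),
    (183197, 210.00, 'Sarah Brown', '20190320'),
    (183198, 150.00, 'John Brown',  '20190321'),
    (183198, 210.00, 'Sarah Brown', '20190321'),
    (183200, 125.00, 'John Doe',    '20190320'),
    (183200, 110.00, 'Sarah Doe',   '20190320'),
    (183202, 125.00, 'John Doe',    '20190321'),
    (183202, 110.00, 'Sarah Doe',   '20190321'),
    (183202, 115.00, 'Paul Rudd',   '20190321');
```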

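I thought I could build a summary table with counts and sums, and that gets me close, but I'm having trouble bringing the names into the comparison to find the true duplicates.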

DECLARE @Summaries TABLE(
BatchId INT,
BatchDate DATETIME,
BatchCount INT,
BatchAmount MONEY)

-- Summarize the Data so we can look for duplicates
INSERT INTO @Summaries
SELECT a.BatchId, a.BatchDate, COUNT(*) AS RecordCount, SUM(a.InputAmount) AS BatchAmount 
FROM Batches a
WHERE a.BatchDate BETWEEN '20190316' and '20190321'
GROUP BY a.BatchId, a.BatchDate
ORDER BY a.BatchId DESC

-- find the potential duplicate batches based on the Counts and Sums
SELECT A.* FROM @Summaries A
INNER JOIN (SELECT BatchCount, BatchAmount, BatchDate  FROM @Summaries
            GROUP BY BatchCount, BatchAmount, BatchDate
            HAVING COUNT(*) > 1) B
    ON A.BatchCount = B.BatchCount 
        AND A.BatchAmount = B.BatchAmount 
WHERE DATEDIFF(DAY, a.BatchDate, b.BatchDate) BETWEEN -1 AND 1  

Thanks for your help. I'm using a SQL Server 2012 database.

Tags: sql

Solution

You can try the following:

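-- Note: this lists every batch that has more than one record;
-- it does not compare the contents of one batch against another.
;WITH cte AS (
    SELECT BatchId
    FROM table_name
    GROUP BY BatchId
    HAVING COUNT(*) > 1
)
SELECT *
FROM table_name a
WHERE a.BatchId IN (SELECT BatchId FROM cte);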
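
The query above only lists batches that contain more than one record, so it cannot tell whether two batches hold the same payments. One possible way to compare batch contents is sketched below. It is an illustration rather than a tested solution: it assumes the `Batches` table and columns from the question, and it assumes that no batch contains the same (InputAmount, CustomerName) pair twice. The names `BatchCounts` and `DuplicateBatchId` are made up for this example. Two batches are reported as duplicates when they have the same number of records and every record in one matches a record in the other on amount and name, regardless of BatchDate.

```sql
-- Sketch only: assumes (InputAmount, CustomerName) is unique within a batch.
;WITH BatchCounts AS (
    -- Number of records in each batch
    SELECT BatchId, COUNT(*) AS RecordCount
    FROM Batches
    GROUP BY BatchId
)
SELECT a.BatchId AS BatchId,
       b.BatchId AS DuplicateBatchId
FROM Batches a
INNER JOIN Batches b
        ON b.BatchId > a.BatchId              -- report each pair once
       AND b.InputAmount = a.InputAmount
       AND b.CustomerName = a.CustomerName
INNER JOIN BatchCounts ca ON ca.BatchId = a.BatchId
INNER JOIN BatchCounts cb ON cb.BatchId = b.BatchId
WHERE ca.RecordCount = cb.RecordCount         -- both batches are the same size
GROUP BY a.BatchId, b.BatchId, ca.RecordCount
HAVING COUNT(*) = ca.RecordCount              -- every record found a match
ORDER BY a.BatchId;

-- With the sample data from the question this should return:
-- 182944 | 182945
-- 183197 | 183198
```

Batches 183200 and 183202 are not reported because their record counts differ (2 versus 3), and 183194 and 183195 are not reported because the customer names differ. If duplicates should only be considered within a certain number of days, a condition such as `ABS(DATEDIFF(DAY, a.BatchDate, b.BatchDate)) <= 1` could be added to the join between `a` and `b`.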
