首页 > 解决方案 > 删除重复的行集

问题描述

数据

CREATE TABLE #tbl_LinkedInvoices(
    InvoiceNbr              varchar(50)
,   AssociatedInvoiceNbr    varchar(50)
,   RowNbr                  int
,   AssociatedRowNbr        int
)
INSERT INTO #tbl_LinkedInvoices(
        InvoiceNbr, AssociatedInvoiceNbr, RowNbr, AssociatedRowNbr)
VALUES 
    ('A0001', 'A1001', 1, 4),
    ('A0002', 'A2002', 2, 5),
    ('A0002', 'A3002', 3, 6),
    ('A1001', 'A0001', 4, 1),
    ('A2002', 'A0002', 5, 2),
    ('A3002', 'A0002', 6, 3)

    SELECT * FROM #tbl_LinkedInvoices

挑战/目标

tbl_LinkedInvoices旨在识别链接到的AssociatedInvoiceNbrs ban 。InvoiceNbr a因此,一个集合可以在表中出现多次,因为(a, b) = (b,a). 为了解决这些重新出现RowNbrAssociatedRowNbr添加了字段以提供分组序列。

使用已识别的重复行,删除重复行,在表中保留一条唯一记录。当前脚本产生错误,期望可能有更好的方法来编写查询。

脚本

使用计数器检查重复行是否仍然存在,如果确实删除了该行,直到FALSE.

DECLARE @RowCounter int
DECLARE @RemoveRow int
SET @RowCounter = 1



IF EXISTS (SELECT 
              RowNbr
           FROM #tbl_LinkedInvoices WHERE RowNbr = (SELECT AssociatedRowNbr FROM #tbl_LinkedInvoices)
)
BEGIN
   SET @RemoveRow = (SELECT RowNbr FROM #tbl_LinkedINvoices
                     WHERE RowNbr = (
                            SELECT AssociatedRowNbr FROM #tbl_LinkedInvoices WHERE RowNbr =@RowCounter ))
    BEGIN
        DELETE FROM #tbl_LinkedInvoices
        WHERE
        RowNbr = @RemoveRow
    END
    BEGIN
        SET @RowCounter = @RowCounter + 1
    END
END

错误

Msg 512, Level 16, State 1, Line 212
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

标签: sqlsql-servertsqlduplicatessql-delete

解决方案


如果我正确地遵循了您,您可以在单个语句中执行“镜像”记录的删除,而无需使用额外的计算列:

delete t
from #tbl_LinkedInvoices t
where exists (
    select 1 
    from #tbl_LinkedInvoices t1 
    where 
        t1.AssociatedInvoiceNbr = t.InvoiceNbr
        and t1.InvoiceNbr = t.AssociatedInvoiceNbr
        and t1.AssociatedInvoiceNbr > t.AssociatedInvoiceNbr
)

This removes mirror records while retaining the one whose InvoiceNbr is smaller than AssociatedInvoiceNbr.

Demo on DB Fiddle with your sample data.

After the delete statement is executed, the content of the table is:

InvoiceNbr | AssociatedInvoiceNbr | RowNbr | AssociatedRowNbr
:--------- | :------------------- | -----: | ---------------:
A0001      | A1001                |      1 |                4
A0002      | A2002                |      2 |                5
A0002      | A3002                |      3 |                6

推荐阅读