首页 > 解决方案 > 删除重复的电子邮件记录

问题描述

我们需要编写一个 SQL 查询来删除名为 的表中的所有重复电子邮件条目Person,仅保留基于其最小 Id 的唯一电子邮件。

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+

Id是此表的主键列。

例如,运行查询后,上Person表应该有以下行:

+----+------------------+
| Id | Email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+

注意:执行 SQL 后输出的是整个 Person 表。需要使用delete语句。

这是我的答案:

delete Person
from Person p
left join 
    (select Id, Email 
     from
         (select 
              Id, Email,
              ROW_NUMBER() OVER(PARTITION BY Email ORDER BY Id) AS row_num
          from Person) as temp
     where row_num = 1) as temptwo on p.Id = temptwo.Id
where 
    temptwo.Id is null

但是在我运行查询之后,它一直告诉我这一行中存在语法错误:

ROW_NUMBER() OVER(PARTITION BY Email ORDER BY Id) AS row_num

有人可以告诉我有什么问题吗?

标签: sqlsql-serversql-delete

解决方案


你可以使用:

with todelete as (
      select p.*,
             ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Id) AS seqnum
      from Person p
     ) 
delete from todelete
    where seqnum > 1;

也就是说,SQL Server 支持可更新的 CTE(以及子查询和视图),因此您无需计算并连接回表。

编辑:

在 MySQL 中,您将使用:

delete p
    from p join
         (select email, min(id) as min_id
          from person pp
          group by email
         ) pp
         on p.email = pp.email and p.id > p.min_id;

推荐阅读