SQL GROUP BY results - Salesforce Marketing Cloud

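Question

I'm looking to find all duplicate records and then select all duplicates minus the oldest record from each set, so that I can delete the duplicates and keep one unique record.

When I run this query, I get the results I want: one row per email address, with its earliest created date.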

SELECT 
    EmailAddress,
    MIN(CreatedDate)
FROM [_ListSubscribers]
WHERE EmailAddress IN
    (
        SELECT EmailAddress
        FROM _ListSubscribers
        GROUP BY EmailAddress
        HAVING COUNT(EmailAddress) > 1
    )
GROUP BY EmailAddress

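When I add SubscriberKey to the query, the number of results doubles. Why is that? I only want to see the subscriber key associated with the email address that the subquery found to have the oldest date.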

SELECT 
    EmailAddress,
    SubscriberKey,
    MIN(CreatedDate)
FROM [_ListSubscribers]
WHERE EmailAddress IN
    (
        SELECT EmailAddress
        FROM _ListSubscribers
        GROUP BY EmailAddress
        HAVING COUNT(EmailAddress) > 1
    )
GROUP BY EmailAddress, SubscriberKey

Tags: sql, group-by, salesforce-marketing-cloud

Solution


From the question: "I'm looking to find all duplicate records and then select all duplicates minus the oldest record from each set so that I can delete duplicates and keep one unique record."

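Use ROW_NUMBER() to number the rows within each email address. The query below orders each group newest first, so seqnum > 1 returns every row except the newest; order by CreatedDate asc instead to return every row except the oldest: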

select l.*
from (select l.*,
             row_number() over (partition by EmailAddress order by CreatedDate desc) as seqnum
      from _ListSubscribers l
     ) l
where seqnum > 1;

However, if you want to delete all but the newest record, you can use:

delete from _ListSubscribers
    where CreatedDate < (select max(CreatedDate)
                         from _ListSubscribers l2
                         where l2.EmailAddress = _ListSubscribers.EmailAddress
                        );

If you want to keep the oldest record instead, flip the logic: use min() instead of max(), and change the comparison from < to >.
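
As for why adding SubscriberKey doubled the results: GROUP BY EmailAddress, SubscriberKey returns one row per distinct (EmailAddress, SubscriberKey) pair. So if the duplicate rows for an email address carry different subscriber keys, which is presumably the case here, each key becomes its own group with its own MIN(CreatedDate). Below is a minimal sketch of one way to return only the subscriber key on the oldest row of each duplicated email address, reusing the ROW_NUMBER() approach. The names seqnum, cnt, and OldestCreatedDate are illustrative aliases, and the query assumes the engine supports window functions:

```sql
select EmailAddress,
       SubscriberKey,
       CreatedDate as OldestCreatedDate
from (select l.EmailAddress,
             l.SubscriberKey,
             l.CreatedDate,
             -- seqnum = 1 marks the oldest row for this email address
             row_number() over (partition by l.EmailAddress order by l.CreatedDate asc) as seqnum,
             -- cnt = number of rows sharing this email address
             count(*) over (partition by l.EmailAddress) as cnt
      from _ListSubscribers l
     ) l
where seqnum = 1   -- keep only the oldest row per email address
  and cnt > 1;     -- restrict to email addresses that actually have duplicates
```

If two rows for the same email address share the same CreatedDate, ROW_NUMBER() picks one of them arbitrarily; adding a tiebreaker column to the order by makes the choice deterministic.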

