SQL count duplicates in another column based on one field per row

Problem description

I am building out a customer retention report. We identify customers by their email. Here is some sample data from our table:

+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
|           Email            | BrandNewCustomer | RecurringCustomer | ReactivatedCustomer | OrderCount | TotalOrders | Date_Created | Customer_Name | Customer_Address | Customer_City | Customer_State | Customer_Zip | Customer_Country |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| zyw@marketplace.amazon.com |                1 |                 0 |                   0 |          1 |           1 | 41:50.0      | Sha           |              990 | BRO           | NY             |          112 | US               |
| zyu@gmail.com              |                1 |                 0 |                   0 |          1 |           1 | 57:25.0      | Zyu           |              181 | Mia           | FL             |          330 | US               |
| ZyR@aol.com                |                1 |                 0 |                   0 |          1 |           1 | 10:19.0      | Day           |              581 | Myr           | SC             |          295 | US               |
| zyr@gmail.com              |                1 |                 0 |                   0 |          1 |           1 | 25:19.0      | Nic           |              173 | Was           | DC             |          200 | US               |
| zy@gmail.com               |                1 |                 0 |                   0 |          1 |           1 | 19:18.0      | Kim           |              675 | MIA           | FL             |          331 | US               |
| zyou@gmail.com             |                1 |                 0 |                   0 |          1 |           1 | 40:29.0      | zoe           |              160 | Mob           | AL             |          366 | US               |
| zyon@yahoo.com             |                1 |                 0 |                   0 |          1 |           1 | 17:21.0      | Zyo           |              84  | Sta           | CT             |          690 | US               |
| zyo@gmail.com              |                1 |                 0 |                   0 |          2 |           2 | 02:03.0      | Zyo           |              432 | Ell           | GA             |          302 | US               |
| zyo@gmail.com              |                1 |                 0 |                   0 |          1 |           2 | 12:54.0      | Zyo           |              432 | Ell           | GA             |          302 | US               |
| zyn@icloud.com             |                1 |                 0 |                   0 |          1 |           1 | 54:56.0      | Zyn           |              916 | Nor           | CA             |          913 | US               |
| zyl@gmail.com              |                0 |                 1 |                   0 |          3 |           3 | 31:27.0      | Ser           |              123 | Mia           | FL             |          331 | US               |
| zyk@marketplace.amazon.com |                1 |                 0 |                   0 |          1 |           1 | 44:00.0      | Myr           |              101 | MIA           | FL             |          331 | US               |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+

We define a customer by email: all orders with the same email are grouped under one customer, and our calculations are built on top of that grouping.

Now I am trying to find customers whose email has changed. To do this, we want to match customers up by their address.

So for each row, I want an extra column called something like Orders_With_Same_Address_Different_Email. How would I do that?

I have tried something with DENSE_RANK, but it doesn't seem to work:

SELECT DISTINCT
Email
,BrandNewCustomer
,RecurringCustomer
,ReactivatedCustomer
,OrderCount
,TotalOrders
,Date_Created
,Customer_Name
,Customer_Address
,Customer_City
,Customer_State
,Customer_Zip
,Customer_Country
,(DENSE_RANK() over (partition by Email order by (case when email <> email then Customer_Address end)  asc) 
+DENSE_RANK() over ( partition by Email order by (case when email <> email then Customer_Address end)  desc) 
- 1) as Orders_With_Same_Name_Different_Email
--*
FROM Customers

Tags: sql, sql-server, dense-rank

Solution


Try counting emails partitioned by address, rather than by email:

select   Email,
         -- ... other columns

         -- 1 when more than one row shares this address, otherwise 0
         Orders_With_Same_Name_Different_Email = iif(
             count(Email) over (partition by Customer_Address) > 1,
             1, 0)

from     Customers;
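One caveat with the count above: it counts rows, so two orders from the same email at the same address (like the two zyo@gmail.com rows at address 432 in the sample) are flagged as well. If the goal is to flag only addresses shared by different emails, the ascending-plus-descending DENSE_RANK trick from the question can be reused, as long as it is partitioned by address and ordered by email. (In the original attempt, `email <> email` is never true, so the CASE always yields NULL and the expression always evaluates to 1.) The following is a minimal sketch rather than a tested solution. It assumes Email and Customer_Address contain no NULLs, and the output column names Distinct_Emails_At_Address and Has_Other_Email_At_Address are hypothetical.

```sql
-- Sketch only (T-SQL). Assumes Email and Customer_Address are never NULL.
-- Output column names below are illustrative, not from the original post.
SELECT
    Email,
    Customer_Address,
    -- Ascending rank + descending rank - 1 = number of distinct emails
    -- within the same address partition.
    DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email ASC)
  + DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email DESC)
  - 1 AS Distinct_Emails_At_Address,
    -- 1 only when at least one different email shares this address.
    IIF(
        DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email ASC)
      + DENSE_RANK() OVER (PARTITION BY Customer_Address ORDER BY Email DESC)
      - 1 > 1, 1, 0
    ) AS Has_Other_Email_At_Address
FROM Customers;
```

Whether differently-cased emails count as the same value depends on the column's collation. Since a street number alone may repeat across cities, it may be safer to add Customer_City, Customer_State and Customer_Zip to the PARTITION BY list.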

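But this is a lesson in why you should not use email as a customer identifier. Address is a bad idea too. Use something that does not change. That usually means creating an internal identifier, such as an auto-incrementing value: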

alter table #customers
add customerId int identity(1,1) primary key not null

Now customerId = 1 will always refer to that particular customer.

