sql - SQL count duplicates in another column based on one field per row
Problem description
I am building out a customer retention report. We identify customers by their email. Here is some sample data from our table:
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| Email                      | BrandNewCustomer | RecurringCustomer | ReactivatedCustomer | OrderCount | TotalOrders | Date_Created | Customer_Name | Customer_Address | Customer_City | Customer_State | Customer_Zip | Customer_Country |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| zyw@marketplace.amazon.com | 1                | 0                 | 0                   | 1          | 1           | 41:50.0      | Sha           | 990              | BRO           | NY             | 112          | US               |
| zyu@gmail.com              | 1                | 0                 | 0                   | 1          | 1           | 57:25.0      | Zyu           | 181              | Mia           | FL             | 330          | US               |
| ZyR@aol.com                | 1                | 0                 | 0                   | 1          | 1           | 10:19.0      | Day           | 581              | Myr           | SC             | 295          | US               |
| zyr@gmail.com              | 1                | 0                 | 0                   | 1          | 1           | 25:19.0      | Nic           | 173              | Was           | DC             | 200          | US               |
| zy@gmail.com               | 1                | 0                 | 0                   | 1          | 1           | 19:18.0      | Kim           | 675              | MIA           | FL             | 331          | US               |
| zyou@gmail.com             | 1                | 0                 | 0                   | 1          | 1           | 40:29.0      | zoe           | 160              | Mob           | AL             | 366          | US               |
| zyon@yahoo.com             | 1                | 0                 | 0                   | 1          | 1           | 17:21.0      | Zyo           | 84               | Sta           | CT             | 690          | US               |
| zyo@gmail.com              | 1                | 0                 | 0                   | 2          | 2           | 02:03.0      | Zyo           | 432              | Ell           | GA             | 302          | US               |
| zyo@gmail.com              | 1                | 0                 | 0                   | 1          | 2           | 12:54.0      | Zyo           | 432              | Ell           | GA             | 302          | US               |
| zyn@icloud.com             | 1                | 0                 | 0                   | 1          | 1           | 54:56.0      | Zyn           | 916              | Nor           | CA             | 913          | US               |
| zyl@gmail.com              | 0                | 1                 | 0                   | 3          | 3           | 31:27.0      | Ser           | 123              | Mia           | FL             | 331          | US               |
| zyk@marketplace.amazon.com | 1                | 0                 | 0                   | 1          | 1           | 44:00.0      | Myr           | 101              | MIA           | FL             | 331          | US               |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
We define a customer by email: all orders with the same email are attributed to one customer, and we run our calculations on top of that.
Now I am trying to find customers whose email has changed. To do this, we want to match customers up by their address.
So for each row, I want another column, called something like Orders_With_Same_Address_Different_Email, that counts the orders placed from the same address under a different email. How would I do that?
I have tried doing something with DENSE_RANK, but it doesn't seem to work:
SELECT DISTINCT
     Email
    ,BrandNewCustomer
    ,RecurringCustomer
    ,ReactivatedCustomer
    ,OrderCount
    ,TotalOrders
    ,Date_Created
    ,Customer_Name
    ,Customer_Address
    ,Customer_City
    ,Customer_State
    ,Customer_Zip
    ,Customer_Country
    ,(DENSE_RANK() over (partition by Email order by (case when email <> email then Customer_Address end) asc)
    + DENSE_RANK() over (partition by Email order by (case when email <> email then Customer_Address end) desc)
    - 1) as Orders_With_Same_Name_Different_Email
    --*
FROM Customers
Solution
Try counting emails partitioned by address instead of by email:
select Email,
    -- ...
    -- 1 when more than one row shares this address, otherwise 0
    Orders_With_Same_Name_Different_Email = iif(
        count(email) over (partition by Customer_Address) > 1,
        1, 0)
from Customers;
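One caveat with the query above: count(email) over (partition by Customer_Address) counts every row at an address, so a customer with two orders under the same email (like zyo@gmail.com in the sample) would also pass the > 1 test. If an actual count of orders under a different email is wanted, one possible sketch is to subtract the per-address-and-email count from the per-address count. This assumes T-SQL and that Customer_Address alone is a good enough match key (in practice city, state and zip may need to be added to the partition). The inline Customers rows below are illustrative, and the other@example.com row is invented to show a non-zero result.

```sql
-- Illustrative stand-in for the Customers table; only the two columns needed here.
with Customers (Email, Customer_Address) as (
    select *
    from (values
        ('zyo@gmail.com',     '432'),
        ('zyo@gmail.com',     '432'),
        ('zyl@gmail.com',     '123'),
        ('other@example.com', '123')  -- invented row: same address, different email
    ) as v (Email, Customer_Address)
)
select Email,
       Customer_Address,
       -- rows at this address, minus rows at this address with this same email
       count(*) over (partition by Customer_Address)
     - count(*) over (partition by Customer_Address, Email)
       as Orders_With_Same_Address_Different_Email
from Customers;
```

With these rows, both zyo@gmail.com orders get 0, while zyl@gmail.com and other@example.com each get 1.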
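But this is a lesson in why you don't use email as a customer identifier. Address is a bad idea too. Use something that won't change. That usually means creating an internal identifier, such as an auto-incrementing column: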
alter table #customers
add customerId int identity(1,1) primary key not null
Now customerId = 1 will always refer to that particular customer.
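As a rough illustration of that idea, here is a minimal T-SQL sketch in which orders reference the customer by the internal id rather than by email. The table and column names (dbo.Customer, dbo.CustomerOrder, orderId) are hypothetical and not from the original post; the point is only that an email change becomes a single-row update.

```sql
-- Hypothetical schema: one row per customer, keyed by an identity column.
create table dbo.Customer (
    customerId int identity(1,1) not null primary key,
    Email      nvarchar(320) not null
);

-- Hypothetical orders table: points at the customer by id, not by email.
create table dbo.CustomerOrder (
    orderId      int identity(1,1) not null primary key,
    customerId   int not null references dbo.Customer (customerId),
    Date_Created datetime2 not null
);

insert into dbo.Customer (Email) values ('zyo@gmail.com');  -- receives customerId = 1
insert into dbo.CustomerOrder (customerId, Date_Created)
values (1, sysdatetime());

-- The email changes in one place; the existing order still belongs to customerId = 1.
update dbo.Customer
set Email = 'new@example.com'
where customerId = 1;
```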