sql - SQL -- 如何编写一个select语句来查找连续5天下订单的客户
问题描述
orderId、orderDate、customerId 是这里的相关字段。
客户可能有超过 1 个连续 5 天。
我希望输出看起来像这样
customerID startDate endDate numDays 1 2020/01/01 2020/01/05 5 1 2020/10/1 2020/10/10 10 101 2020/04/10 2020/04/15 6
到目前为止,这就是我所拥有的:
;
with t1 as (
select distinct o.idcustomer,orderdate, dateadd(dd,1,orderdate) nextOrderDate, 1 as tday, orderstatus
from orders o
join customers c on c.idcustomer=o.idcustomer
where orderstatus in (3,4) and c.customertype=0
), t2 as (
select * from t1
union all
select o2.idcustomer, o2.orderdate, dateadd(dd,1,o2.orderdate), o.tday+1, o2.orderstatus
from t1 o2
join t2 o on o2.idcustomer=o.idcustomer and o2.orderdate=o.nextOrderDate and o2.orderstatus in (3,4)
)
--select idcustomer, max(tday) DaysInARow, min(orderDate) StartDate, max(orderdate) endDate
select idcustomer, dateadd(dd,-5,min(orderdate)) firstOrderDate, max(orderdate) lastOrderDate
from t2
where tday>=5
group by idcustomer, tday
order by idcustomer
解决方案
这是一个间隙和孤岛问题,您希望将客户有订单的连续天组合在一起。
如果客户每天最多有一个订单,您可以使用日期算法针对递增序列构建组。假设您正在运行 SQL Server,如当前查询的语法所示:
select customer_id, min(orderdate) startdate, max(orderdate) enddate, count(*) cnt
from (
select c.customerid, o.orderdate,
row_number() over(partition by customerid order by o.orderdate) rn
from orders o
inner join customers c on c.idcustomer = o.idcustomer
where o.orderstatus in (3, 4) and c.customertype = 0
) t
group by c.customer_id, dateadd(day, -rn, o.orderdate)
如果您只想显示连续 5 天以上,只需添加一个having
子句:
having count(*) >= 5
而且,如果您只想要每个客户的最大连胜记录(最小长度为 5):
select *
from (
select customer_id, min(orderdate) startdate, max(orderdate) enddate, count(*) cnt,
rank() over(partition by customer_id order by count(*) desc) rn2
from (
select c.customerid, o.orderdate,
row_number() over(partition by customerid order by o.orderdate) rn
from orders o
inner join customers c on c.idcustomer = o.idcustomer
where o.orderstatus in (3, 4) and c.customertype = 0
) t
group by c.customer_id, dateadd(day, -rn, o.orderdate)
having count(*) >= 5
) t
where rn2 = 1
如果有重复(customerid, orderdate)
,那么我们使用dense_rank()
代替row_number()
,和count(distinct orderdate)
代替count(*)
。
推荐阅读
- python - 更改 PIL 图像覆盖的颜色没有区别
- symfony - ApiPlatform - 在子资源路由上实现安全授权
- r - 如何更改 ggplot 中的标题以获取一系列情节?
- node.js - Node.js 托管:低带宽,高 CPU
- mongodb - 如何与现代 Mongo scala 驱动程序一起运行传统的 Mongo Casbah 驱动程序
- azure - 将组添加到 ADO 中的所有服务连接
- typescript - Next.js、Styled-components 和 Yandex Metrica 会话重放
- node.js - 将 socket.io 升级到 v3 后出现 express-session 问题
- merge - 使用 Spacy 的模型“en_core_web_md”时,合并跨度非常慢
- python - 在python相同长度的循环中组合两个列表