apache-spark-sql - Spark SQL: sum and group by over multiple columns
Problem description
I have a table with a customerId column and a classification: weekly buyer, monthly buyer and yearly buyer.
The classification, based on the number of days between an order and the previous one, is already done:
select customerId, date, Differenz,
       case when Differenz <= 9 then "weekly buyer"
            when Differenz between 10 and 19 then "every two weeks"
            when Differenz between 20 and 40 then "monthly buyer"
            when Differenz > 40 then "occasional buyer"
            else null
       end as Einstufung
from (
    select customerId, date,
           -- datediff avoids the year-boundary bug of
           -- dayofyear(date) - lag(dayofyear(date)), which goes
           -- negative when consecutive orders span New Year
           datediff(date, lag(date) over (partition by customerId order by date)) as Differenz
    from orders
) t
order by customerId, date
I would like to know whether, in the end, a customer buys weekly, every two weeks, monthly or only occasionally.
A customer can order several times, so customerId = 1 may order weekly, weekly and monthly. All in all, that makes him an occasional buyer. How can I define this in SQL?
I have simplified the table in Excel:
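One possible approach (a sketch, not from the original post): instead of classifying every gap and then trying to summarize the labels, compute the average gap per customer and apply the same thresholds once per customer. The thresholds, column names and the use of datediff are assumptions carried over from the query above:

```sql
-- Sketch: one classification per customer, based on the average
-- number of days between consecutive orders.
select customerId,
       case when avg(Differenz) <= 9 then "weekly buyer"
            when avg(Differenz) between 10 and 19 then "every two weeks"
            when avg(Differenz) between 20 and 40 then "monthly buyer"
            when avg(Differenz) > 40 then "occasional buyer"
       end as Einstufung
from (
    -- gap in days to the previous order of the same customer
    select customerId,
           datediff(date, lag(date) over (partition by customerId order by date)) as Differenz
    from orders
) t
where Differenz is not null  -- first order of each customer has no gap
group by customerId
```

If you instead want the most frequent per-order label, group the classified rows by customerId and Einstufung, count them, and keep the top row per customer with row_number() over (partition by customerId order by count(*) desc).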