Spark SQL: summing and grouping over multiple columns

Problem description

I have a table with a `customerId` column and a classification: weekly buyer, monthly buyer, and yearly buyer.

The classification based on the number of days between an order and the previous one is already done:

-- datediff avoids the year-boundary bug of dayofyear(date) - lag(dayofyear(date)),
-- which goes negative when consecutive orders straddle a new year
select customerId,
       datediff(date, lag(date) over (partition by customerId order by date)) as Differenz,
       case when datediff(date, lag(date) over (partition by customerId order by date)) <= 9 then "weekly buyer"
            when datediff(date, lag(date) over (partition by customerId order by date)) between 10 and 19 then "every two weeks"
            when datediff(date, lag(date) over (partition by customerId order by date)) between 20 and 40 then "monthly buyer"
            when datediff(date, lag(date) over (partition by customerId order by date)) > 40 then "occasional buyer"
            else null
       end as Einstufung
from orders
order by customerId, date

I want to know whether, in the end, a customer buys weekly, every two weeks, monthly, or only occasionally.

A customer can order several times, so customerId = 1 might order weekly, weekly, and then monthly. Taken together he would then be a weekly buyer. How do I define that in SQL?

I simplified the table in Excel:

(screenshot of the simplified example table)

Tags: apache-spark-sql

Solution


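The solution body is missing from this capture. One common way to collapse the per-gap classifications into a single label per customer is to classify each gap as in the question's query and then take each customer's most frequent label (the mode). A minimal sketch of that idea, run through SQLite from Python only so the query is executable here; in Spark SQL the `julianday` subtraction would simply be `datediff`, and the rest is identical. The table data and label boundaries follow the question; the CTE names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customerId INTEGER, date TEXT);
-- invented sample data: customer 1 orders weekly, weekly, then monthly;
-- customer 2 leaves a 59-day gap
INSERT INTO orders VALUES
  (1, '2021-01-01'), (1, '2021-01-08'), (1, '2021-01-15'), (1, '2021-02-12'),
  (2, '2021-01-01'), (2, '2021-03-01');
""")

query = """
WITH gaps AS (          -- days since the previous order, per customer
  SELECT customerId,
         CAST(julianday(date) - julianday(
              lag(date) OVER (PARTITION BY customerId ORDER BY date))
              AS INTEGER) AS diff
  FROM orders
),
classified AS (         -- same CASE ladder as in the question
  SELECT customerId,
         CASE WHEN diff <= 9 THEN 'weekly buyer'
              WHEN diff BETWEEN 10 AND 19 THEN 'every two weeks'
              WHEN diff BETWEEN 20 AND 40 THEN 'monthly buyer'
              WHEN diff > 40 THEN 'occasional buyer'
         END AS Einstufung
  FROM gaps
  WHERE diff IS NOT NULL
),
counted AS (            -- how often each label occurs per customer
  SELECT customerId, Einstufung, COUNT(*) AS n
  FROM classified
  GROUP BY customerId, Einstufung
),
ranked AS (             -- rank labels by frequency; ties are broken arbitrarily
  SELECT customerId, Einstufung,
         ROW_NUMBER() OVER (PARTITION BY customerId ORDER BY n DESC) AS rk
  FROM counted
)
SELECT customerId, Einstufung FROM ranked WHERE rk = 1 ORDER BY customerId
"""
rows = conn.execute(query).fetchall()
print(rows)  # customer 1 -> 'weekly buyer', customer 2 -> 'occasional buyer'
```

If "most recent behaviour" matters more than the majority label, the `ORDER BY` in the `ranked` step could instead rank by the latest order date carrying each label.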