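sql - Specifying a contingency for the max(date) function in a Redshift query
Problem description
For each calendar week in which a customer did not log in, I am trying to determine when their last login was. The starting dataset looks like this: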
User_Name Login_Date Week_Ending
Bobisaur 1/1/2019 1/5/2019
Bobisaur 1/3/2019 1/5/2019
Bobisaur 3/5/2019 3/9/2019
Bobisaur 3/24/2019 3/30/2019
Bobisaur 4/1/2019 4/6/2019
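For the week ending 1/12/2019, and for a number of later weeks, there are no logins for customer Bobisaur.
What I want to end up with is a dataset that looks like this: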
User_Name Week_Ending (for weeks with no login) Days Since Last Login (as of week ending date)
Bobisaur 1/12/2019 7
Bobisaur 1/19/2019 14
Bobisaur 1/26/2019 21
Bobisaur 2/2/2019 28
Bobisaur 2/9/2019 35
Bobisaur 2/16/2019 42
Bobisaur 2/23/2019 49
Bobisaur 3/2/2019 56
Bobisaur 3/16/2019 11
Bobisaur 3/23/2019 18
Bobisaur 4/13/2019 12
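This filters down to the list of weeks in which the user did not log in, with a column showing how many days had passed since their last login as of that particular week-ending date.
I am able to get the Week_Ending dates with no login activity, but I am stuck on calculating "Days Since Last Login (as of week ending date)". What I tried was (Week_Ending - max(Login_Date)), followed by a filter on max(Login_Date) <= Week_Ending. However, this basically removes every row whose Week_Ending value is earlier than the max Login_Date.
Any help would be greatly appreciated. Thanks!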
Solution
The resulting query looks complicated, but it is not that bad:
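- input_raw - whatever input you pasted in the question.
- input - the same logins with login_date swapped for its respective week_ending.
- cal - a calendar table of consecutive Saturdays (your definition of the week end). In Redshift, a common way to generate rows is a SELECT over rows that already exist. Here I cross join the input with itself to get 25 (5x5) rows, and therefore 25 consecutive Saturdays. The CROSS JOIN can be changed to SELECT row_number() over () from arbitrary table limit 25.
- cal_usrs - a table of all users paired with all Saturdays.
- output_raw - the core of the calculation. It first JOINs cal_usrs with input, then uses a window function to get the days since the last login. Since it is not possible to filter on a column that holds a window function result in the same SELECT (and your desired output has no rows with 0), there is a final SELECT.
- final SELECT - selects only the rows we are interested in.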
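As a cross-check of the cal and cal_usrs steps, here is a minimal Python sketch of the same date arithmetic. It is not Redshift code; the helper name `saturdays` and the `users` list are made up for illustration, and it assumes `date_trunc('week', ...)` returns the Monday of the week, as Redshift documents.

```python
from datetime import date, timedelta
from itertools import product

def saturdays(anchor, count):
    """Return `count` consecutive Saturdays, starting one week after the
    Saturday of the week that contains `anchor` (hypothetical helper)."""
    # Monday of the anchor's week, like date_trunc('week', anchor)
    monday = anchor - timedelta(days=anchor.weekday())
    base_saturday = monday + timedelta(days=5)
    # row_number() starts at 1, so the first generated row is base + 7 days
    return [base_saturday + timedelta(weeks=n) for n in range(1, count + 1)]

cal = saturdays(date(2018, 12, 1), 25)   # 25 rows, like the 5x5 cross join
users = ["Bobisaur"]                     # distinct usernames from the input
cal_usrs = list(product(users, cal))     # every user paired with every Saturday

print(cal[0], cal[-1], len(cal_usrs))
```

The calendar runs from 12/8/2018 through 5/25/2019, which is why the result set ends on 05/25/2019; a wider window only needs a larger row count.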
Query:
with input_raw as (
    select 'Bobisaur' as username, '1/1/2019'::date as login_date
    union all
    select 'Bobisaur', '1/3/2019'::date
    union all
    select 'Bobisaur', '3/5/2019'::date
    union all
    select 'Bobisaur', '3/24/2019'::date
    union all
    select 'Bobisaur', '4/1/2019'::date
), input as (
    select
        username,
        -- in your example the week ends on Saturday, hence Monday + 5 days
        date_trunc('week', login_date) + interval '5 days' as week_ending
    from input_raw
), cal as (
    -- this creates a table of consecutive Saturdays
    select
        date_trunc('week', '12/1/2018'::date) + interval '5 days' + 7 * row_number() over () as week_ending
    -- can be changed to 'from arbitrary table limit 25' or whatever time window you wish
    from input_raw a cross join input_raw b -- this produces 25 rows
), cal_usrs as (
    select * from cal cross join (select distinct username from input) as u
    -- this is very important: you want every week paired with every user
), output_raw as (
    select cal_usrs.username,
        cal_usrs.week_ending,
        -- running max: the latest login week seen up to the current week
        max(input.week_ending)
            over (partition by cal_usrs.username order by cal_usrs.week_ending rows between unbounded preceding and current row) as last_login_week,
        extract('days' from cal_usrs.week_ending - last_login_week) as days_since_last_login
    from input
    right join cal_usrs using (username, week_ending)
)
select
    username,
    to_char(week_ending, 'MM/DD/YYYY') as week_ending,
    days_since_last_login
from output_raw
where days_since_last_login <> 0 -- your example did not contain 0 rows
order by output_raw.week_ending -- sort by the date, not by the formatted string
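The running max in output_raw can be checked against a plain loop. The following Python sketch mirrors the query's logic under the same assumptions (weeks truncated to Monday, week end on Saturday); the function names are illustrative and not part of the query.

```python
from datetime import date, timedelta

def week_ending(d):
    # Monday of d's week plus 5 days, mirroring
    # date_trunc('week', login_date) + interval '5 days'
    return d - timedelta(days=d.weekday()) + timedelta(days=5)

def days_since_last_login_week(logins, calendar):
    """For each calendar Saturday without a login, return the days elapsed
    since the most recent login week (a running max, like the window function)."""
    login_weeks = {week_ending(d) for d in logins}
    rows, last = [], None
    for sat in sorted(calendar):
        if sat in login_weeks:
            last = sat  # the calendar is ascending, so this is the running max
        # skip weeks before the first login (NULL in SQL) and weeks with a login (0)
        if last is not None and sat != last:
            rows.append((sat, (sat - last).days))
    return rows

logins = [date(2019, 1, 1), date(2019, 1, 3), date(2019, 3, 5),
          date(2019, 3, 24), date(2019, 4, 1)]
calendar = [date(2018, 12, 1) + timedelta(weeks=n) for n in range(1, 26)]

rows = days_since_last_login_week(logins, calendar)
for sat, days in rows:
    print(sat.strftime("%m/%d/%Y"), days)
```

Weeks before the first login have no last login week at all. In the SQL they come out as NULL and are dropped by the `<> 0` filter, since a comparison with NULL is never true; the loop skips them explicitly.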
The result looks like this (I think you miscalculated in your example after March 16):
username week_ending days_since_last_login
Bobisaur 01/12/2019 7
Bobisaur 01/19/2019 14
Bobisaur 01/26/2019 21
Bobisaur 02/02/2019 28
Bobisaur 02/09/2019 35
Bobisaur 02/16/2019 42
Bobisaur 02/23/2019 49
Bobisaur 03/02/2019 56
Bobisaur 03/16/2019 7
Bobisaur 03/30/2019 7
Bobisaur 04/13/2019 7
Bobisaur 04/20/2019 14
Bobisaur 04/27/2019 21
Bobisaur 05/04/2019 28
Bobisaur 05/11/2019 35
Bobisaur 05/18/2019 42
Bobisaur 05/25/2019 49
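One detail worth checking in these results: 3/24/2019 is a Sunday. The query truncates weeks to Monday, so that login is assigned to the week ending 3/23/2019, whereas the question's data lists it under 3/30/2019. That is why 03/30 appears above with 7 days and 03/23 is missing. The Python sketch below compares the two conventions; the Sunday-start variant is an assumption about what the question intends, not something the query implements.

```python
from datetime import date, timedelta

def week_ending_monday_start(d):
    # Mirrors the query: Monday of d's week plus 5 days
    return d - timedelta(days=d.weekday()) + timedelta(days=5)

def week_ending_sunday_start(d):
    # Assumed alternative: weeks run Sunday through Saturday,
    # so every date maps to the next Saturday on or after it
    days_to_saturday = (5 - d.weekday()) % 7
    return d + timedelta(days=days_to_saturday)

sunday_login = date(2019, 3, 24)
print(week_ending_monday_start(sunday_login))  # 2019-03-23
print(week_ending_sunday_start(sunday_login))  # 2019-03-30

# The two conventions agree for every other login in the sample data
others = [date(2019, 1, 1), date(2019, 1, 3), date(2019, 3, 5), date(2019, 4, 1)]
print(all(week_ending_monday_start(d) == week_ending_sunday_start(d) for d in others))
```

Under the Sunday-start convention, the week ending 03/23 would show 14 days and 03/30 would drop out as a login week. In the SQL, shifting the date by one day before truncating, for example `date_trunc('week', login_date + interval '1 day') + interval '5 days'`, should give the same mapping, though that expression is an untested suggestion.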