sql - SQL:如何在滑动窗口框架中使用线性回归或 regr_slope()?
问题描述
我有以下代码来生成示例表。
create or replace table aggr(ACCT_ID NUMBER(18,0), YYYYMMDD NUMBER(18,0), value NUMBER(10, 2));
insert into aggr values(1,0, 0);
insert into aggr values(1,30, 90);
insert into aggr values(1,60, 60);
insert into aggr values(1,90, 90);
insert into aggr values(2,0, 0);
insert into aggr values(2,30, 90);
insert into aggr values(2,60, 60);
insert into aggr values(2,90, 90);
我想要实现的是对于 each acc_id
,regr_slope(y, x)
可以为给定前两行和当前行 wherex = YYYYMMDD
和的每一行计算y = value
。在这种情况下acct_id=1
,对于每个帐户的前两行,由于它没有前两行,它将返回NULL
。第三行,regr_slope()
将使用 row1,row2,row3 来获取输出。
然后我尝试使用以下代码,但失败了
SELECT
regr_slope(value, YYYYMMDD) over (PARTITION BY acct_id order by YYYYMMDD rows between
2 preceding and current row) as avg_3mo_hist
FROM
aggr
我收到以下错误:
SQL compilation error: error line 2 at position 79 Sliding window frame unsupported for function REGR_SLOPE
根据雪花文件,它说regr_slope()
When used as a window function:
This function does not support:
- ORDER BY sub-clause in the OVER() clause.
- Window frames.
我尝试了很多不同的方法,但都没有奏效。想知道如何用regr_slope()
函数实现目标
解决方案
使用窗口函数来获取“3 个月前”并不是一个安全的操作,因为如果缺少一个月,那么突然间你会覆盖 4、5 或更多个月。
相反,您可以使用自联接解决前 3 个月的聚合:
SELECT
a.ACCT_ID, a.YYYYMMDD, ARRAY_AGG(ARRAY_CONSTRUCT(b.value, b.YYYYMMDD)) arragg
-- over (PARTITION BY acct_id order by YYYYMMDD rows between 2 preceding and current row) as avg_3mo_hist
FROM aggr a
JOIN aggr b
ON a.acct_id=b.acct_id
-- replace with real dates and date math
AND a.YYYYMMDD BETWEEN b.YYYYMMDD AND b.YYYYMMDD+90
GROUP BY 1, 2
ORDER BY 1, 2
有了这些数据,现在很容易获得所需的 3 个月斜率:
WITH window_90_day
AS (
SELECT
a.ACCT_ID, a.YYYYMMDD, ARRAY_AGG(ARRAY_CONSTRUCT(b.value, b.YYYYMMDD)) arragg
FROM aggr a
JOIN aggr b
ON a.acct_id=b.acct_id
-- replace with real dates and date math
AND a.YYYYMMDD BETWEEN b.YYYYMMDD AND b.YYYYMMDD+90
GROUP BY 1, 2
)
SELECT ACCT_ID, YYYYMMDD
, regr_slope(x.value[0], x.value[1]) slope
FROM window_90_day, table(flatten(arragg)) x
GROUP BY 1,2
推荐阅读
- reportbuilder3.0 - 基于参数的报表中的多个 Tablix 显示
- python - 如何在没有python中的熊猫的情况下删除csv文件中的重复行?
- c# - MVC QueryString 参数在 & 之后突然不读取
- scala - Scala中Ad-hoc多态和参数多态的区别
- java - 基于注解的 api 认证
- sql - SQL Server - Aggregate by number of records returned for all groups
- node.js - vue express uploading multiple files to amazon s3
- python - When executing thread, the first execution fails in try statement
- flask - Flask & Swagger:request.files 为空
- macos - 如何使用 mv 命令重命名末尾包含一致子字符串的文件夹?