首页 > 解决方案 > SQL 窗口聚合和计数

问题描述

我正在使用 Redshift/Postgres 数据库。考虑下表:

    +---------------------------------------+
    |              FunkyUsers               |
    +-------+---------+----------+----------+
    |acc_num| user_id |   date   | is_valid |
    +-------+---------+----------+----------+
    |   a1  |    u1   | 20200201 |    true  |
    |   a1  |    u1   | 20200201 |    true  |
    |   a1  |    u1   | 20200311 |    true  |
    |   a1  |    u2   | 20200201 |   false  |
    |   a1  |    u2   | 20200201 |   false  |
    |   a1  |    u2   | 20200201 |   false  |
    |   a1  |    u3   | 20111201 |    true  |
    |   a1  |    u3   | 20111201 |    true  |
    |   a1  |    u3   | 20111201 |    true  |
    +-------+---------+----------+----------+

我喜欢有以下输出:

    +--------------------------------------------------+
    |                  FunkyUsers                      |
    +-------+---------+----------+----------+----------+
    |acc_num| user_id |   date   | is_valid |  count   |
    +-------+---------+----------+----------+----------+
    |   a1  |    u1   | 20200201 |    true  |    2     |
    |   a1  |    u1   | 20200201 |    true  |    2     |
    +-------+---------+----------+----------+----------+
    |   a1  |    u1   | 20200311 |    true  |    2     |
    +-------+---------+----------+----------+----------+
    |   a1  |    u2   | 20200201 |   false  |    0     |
    |   a1  |    u2   | 20200201 |   false  |    0     |
    |   a1  |    u2   | 20200201 |   false  |    0     |
    +-------+---------+----------+----------+----------+
    |   a1  |    u2   | 20111201 |    true  |    1     |
    |   a1  |    u2   | 20111201 |    true  |    1     |
    |   a1  |    u2   | 20111201 |    true  |    1     |
    +-------+---------+----------+----------+----------+

描述:

标签: sqlpostgresqlwindow-functions

解决方案


据我了解,您想为用户计算 is_valid = 1 的不同日期。您可以使用带有不同参数的窗口函数计数作为日期。

select 
    acc_num, user_id, date, is_valid, 
    count(distinct case when is_valid then date end) over (partition by acc_num, user_id, is_valid)
from FunkyUsers

有人告诉我,在 Redshift 的窗口函数中不能使用 distinct。因此,您可以使用以下查询:

with
counts as
(
    SELECT acc_num, user_id, is_valid, COUNT(DISTINCT CASE WHEN is_valid THEN date END) as count
    FROM FunkyUsers
    GROUP BY acc_num, user_id, is_valid
)
SELECT f.*, c.count
FROM FunkyUsers f
LEFT JOIN counts c
    ON f.acc_num = c.acc_num
    AND f.user_id = c.user_id
    AND f.is_valid = c.is_valid

推荐阅读