Counting consecutive days in Postgres

Problem description

I am trying to count consecutive days across two tables that have the following structure:

| id | email              | timestamp           |
| -- | ------------------ | ------------------- |
| 1  | hello@example.com  | 2021-10-22 00:35:22 |
| 2  | hello2@example.com | 2021-10-21 21:17:41 |
| 1  | hello@example.com  | 2021-10-19 00:35:22 |
| 1  | hello@example.com  | 2021-10-18 00:35:22 |
| 1  | hello@example.com  | 2021-10-17 00:35:22 |

I want to count the number of consecutive days of activity. For the data above, the result would be:

| id | email              | length |
| -- | ------------------ | ------ |
| 1  | hello@example.com  | 1      |
| 2  | hello2@example.com | 1      |
| 1  | hello@example.com  | 3      |

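This is made harder because I need to combine the two tables with UNION (or something similar) and then run the grouping. I tried to build on the query below (from "Finding the length of a series in postgres"), but I cannot get it to group consecutive days.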

select max(id) as max_id, email, count(*) as length
from (
    select *, row_number() over wa - row_number() over wp as grp
    from began_playing_video
    window
        wp as (partition by email order by id desc),
        wa as (order by id desc)
    ) s
group by email, grp
order by 1 desc

Any ideas on how to do this in Postgres?

Tags: postgresql

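Solution

First create an aggregate function that counts adjacent dates in a list sorted in ascending order. The jsonb data type is used because it allows mixing different data types in the same array: here the state holds the last date seen (as text) followed by one integer count per run of consecutive days.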

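-- State layout: [last_date_seen, run_length_1, run_length_2, ...]
-- x is the running state (NULL on the first row), y is the initial value passed by the caller.
CREATE OR REPLACE FUNCTION count_date(x jsonb, y jsonb, d date)
RETURNS jsonb LANGUAGE sql AS
$$
    SELECT CASE
             WHEN d IS NULL
             THEN COALESCE(x, y)                                -- ignore NULL dates
             ELSE
               to_jsonb(d :: text)                              -- new head of the state: the date just seen
            || CASE
                 WHEN COALESCE(x, y) = '[]' :: jsonb            -- first date: open the first run
                 THEN '[1]' :: jsonb
                 WHEN COALESCE(x ->> 0, y ->> 0) :: date = d    -- same day again: keep the counts as they are
                 THEN COALESCE(x - 0, y - 0)
                 WHEN COALESCE(x ->> 0, y ->> 0) :: date + 1 = d  -- next day: extend the current run
                 THEN jsonb_set(COALESCE(x - 0, y - 0), '{-1}', to_jsonb(COALESCE(x ->> -1, y ->> -1) :: integer + 1))
                 ELSE COALESCE(x - 0, y - 0) || to_jsonb(1)     -- gap: open a new run
               END
           END ;
$$ ;

DROP AGGREGATE IF EXISTS count_date(jsonb, date) ;
CREATE AGGREGATE count_date(jsonb, date)
(
  sfunc = count_date
, stype = jsonb
) ;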
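
To see what the aggregate accumulates, it can be called on a hand-written list of dates. This is only a sketch: it assumes the count_date function and aggregate above have already been created, and the inline VALUES list (with the made-up aliases v and d) simply mirrors the dates for id 1 in the question.

```sql
-- Assumes the count_date function and aggregate above already exist.
-- v(d) is an illustrative inline table, not part of the original schema.
SELECT count_date('[]', d ORDER BY d) AS state
  FROM (VALUES (DATE '2021-10-17'),
               (DATE '2021-10-18'),
               (DATE '2021-10-19'),
               (DATE '2021-10-22')) AS v(d) ;
-- state: ["2021-10-22", 3, 1]
```

The first element is the last date processed; the remaining elements are the run lengths in chronological order (three consecutive days, then a single day after the gap).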

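The count_date aggregate is then applied to the table, grouped by id:

WITH list AS (
-- The aggregate's second parameter is a date, so the timestamp column is cast explicitly.
SELECT id, email, count_date('[]', "timestamp" :: date ORDER BY "timestamp" :: timestamp) AS count_list
  FROM your_table
 GROUP BY id, email
)
-- count_list - 0 drops the leading date; each remaining element is one run length.
SELECT id, email, jsonb_array_elements(count_list - 0) AS length
  FROM list ;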
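
As an alternative sketch (a different technique from the aggregate above), the same lengths can be computed with window functions alone, using the common "gaps and islands" trick: within one user, subtracting a row number from each distinct day gives the same value for every day of a consecutive run. The inline events data below is an assumption that stands in for the real tables. To combine two tables, the VALUES list could be replaced by a UNION ALL of both SELECTs; the name of the second table is not given in the question.

```sql
WITH events (id, email, ts) AS (
    -- Sample rows copied from the question; replace with e.g.
    --   SELECT id, email, "timestamp" FROM began_playing_video
    --   UNION ALL
    --   SELECT id, email, "timestamp" FROM <your second table>
    VALUES
        (1, 'hello@example.com',  TIMESTAMP '2021-10-22 00:35:22'),
        (2, 'hello2@example.com', TIMESTAMP '2021-10-21 21:17:41'),
        (1, 'hello@example.com',  TIMESTAMP '2021-10-19 00:35:22'),
        (1, 'hello@example.com',  TIMESTAMP '2021-10-18 00:35:22'),
        (1, 'hello@example.com',  TIMESTAMP '2021-10-17 00:35:22')
),
days AS (
    -- One row per user and calendar day, so several events on one day count once.
    SELECT DISTINCT id, email, ts :: date AS day
      FROM events
),
grouped AS (
    -- Consecutive days share the same (day - row_number) value.
    SELECT id, email, day,
           day - (row_number() OVER (PARTITION BY id, email ORDER BY day)) :: int AS grp
      FROM days
)
SELECT id, email, count(*) AS length
  FROM grouped
 GROUP BY id, email, grp
 ORDER BY max(day) DESC ;
```

For the sample rows this yields lengths 1, 1 and 3, matching the result requested in the question.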
