How to get the lead value for a specific case in PostgreSQL?

Problem description

I have a table like the following:

CREATE TABLE events (
  id SERIAL PRIMARY KEY,
  group_id INT,
  code VARCHAR,
  created_date timestamptz
);

Sample values:

INSERT INTO events (group_id, code, created_date) VALUES 
(1, '0001', clock_timestamp()),
(1, '000A', clock_timestamp()),
(1, '0002', clock_timestamp()),
(2, '000A', clock_timestamp()),
(2, '0003', clock_timestamp()),
(2, '0004', clock_timestamp()),
(3, '0001', clock_timestamp()),
(3, '000A', clock_timestamp()),
(4, '0001', clock_timestamp()),
(5, '000A', clock_timestamp()),
(5, '0002', clock_timestamp()),
(5, '000A', clock_timestamp()),
(5, '0006', clock_timestamp());

I have this query:

SELECT 
 DISTINCT ON (group_id) group_id, 
 last_value(code) OVER w_last AS last_code, 
 last_value(created_date) OVER w_last AS last_created_date
FROM events
WINDOW
 w_last AS (PARTITION BY group_id ORDER BY created_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

Query result:

| group_id | last_code | last_created_date        |
| -------- | --------- | ------------------------ |
| 1        | 0002      | 2020-11-18T09:25:53.443Z |
| 2        | 0004      | 2020-11-18T09:25:53.443Z |
| 3        | 000A      | 2020-11-18T09:25:53.443Z |
| 4        | 0001      | 2020-11-18T09:25:53.443Z |
| 5        | 0006      | 2020-11-18T09:25:53.443Z |

I am trying to get an additional field that shows, for each group, the event that comes right after the last '000A' event.

 E.g. for group_id 1 the event code after 000A is 0002
      for group_id 2 the event code after 000A is 0003
      for group_id 3 the event code after 000A is NULL (000A is the last event)
      for group_id 4 the event code after 000A is NULL (there is no 000A event)
      for group_id 5 the event code after 000A is 0006 (taken after the last 000A)

Expected output:

| group_id | last_code | after_last_000A | last_created_date        |
| -------- | --------- | --------------- | ------------------------ |
| 1        | 0002      |     0002        | 2020-11-18T09:25:53.443Z |
| 2        | 0004      |     0003        | 2020-11-18T09:25:53.443Z |
| 3        | 000A      |     NULL        | 2020-11-18T09:25:53.443Z |
| 4        | 0001      |     NULL        | 2020-11-18T09:25:53.443Z |
| 5        | 0006      |     0006        | 2020-11-18T09:25:53.443Z |

The same setup is available in a DB Fiddle.

Tags: sql, postgresql, window-functions

Solution


Step-by-step demo: db<>fiddle

SELECT DISTINCT ON (group_id)                                              -- 4
    group_id,
    code as last_code,
    first_value(after_000A)                                                -- 3
        OVER (PARTITION BY group_id ORDER BY after_000A DESC NULLS LAST) as after_last,
    created_date as last_created_date
FROM (
    SELECT 
        *,
        CASE WHEN code = '000A' THEN                                       -- 2
            lead(code) OVER (PARTITION BY group_id ORDER BY created_date)  -- 1
        END as after_000A
    FROM 
        events
) s
ORDER BY group_id, created_date DESC                                       -- 4
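
  1. lead() returns the value from the next row of the partition if there is one, and NULL otherwise.
  2. The CASE expression discards all irrelevant values, so only rows whose code is '000A' keep their lead value. This is the new after_000A column.
  3. first_value() returns the first value of the ordered partition. Here the group_id partition is ordered by the previously created after_000A column, descending with NULLs last, so the highest non-NULL value comes first. Note that this sorts by the code value itself rather than by created_date: it matches the code after the last 000A in the sample data, but it is not guaranteed to when a later 000A is followed by a lower code.
  4. DISTINCT ON (group_id) returns only one record per group_id, in this case the last one (ordered by created_date DESC).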
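
Because step 3 orders by the value of after_000A, the result depends on how the codes happen to sort. A minimal sketch of a variant that orders by time instead is shown below. It is not part of the original answer and assumes that created_date values are distinct within each group_id. The window puts the '000A' rows first and the newest of them on top, so first_value() reads the code that follows the latest '000A'.

```sql
-- Sketch only: variant of the query above that picks the code following
-- the most recent '000A' row (by created_date) instead of the largest code.
SELECT DISTINCT ON (group_id)
    group_id,
    code AS last_code,
    first_value(after_000A) OVER (
        PARTITION BY group_id
        -- '000A' rows first (true sorts before false under DESC),
        -- rows with a NULL code last, then newest row first
        ORDER BY (code = '000A') DESC NULLS LAST, created_date DESC
    ) AS after_last,
    created_date AS last_created_date
FROM (
    SELECT
        *,
        CASE WHEN code = '000A' THEN
            lead(code) OVER (PARTITION BY group_id ORDER BY created_date)
        END AS after_000A
    FROM events
) s
ORDER BY group_id, created_date DESC;
```

On the sample data this should give the same after_last values as the expected output. One difference in behavior: if the latest '000A' is the final event of a group, this variant returns NULL for that group (as for group_id 3), even when an earlier '000A' had a following event.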

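To see what steps 1 and 2 produce before first_value() and DISTINCT ON are applied, the inner subquery can be run on its own. This is only an inspection aid, assuming the events table and sample rows from the question, with timestamps that are distinct and follow insertion order.

```sql
-- Inspect the intermediate after_000A column for every event.
-- Only rows with code '000A' carry a value; all other rows are NULL.
SELECT
    group_id,
    code,
    created_date,
    CASE WHEN code = '000A' THEN
        lead(code) OVER (PARTITION BY group_id ORDER BY created_date)
    END AS after_000A
FROM events
ORDER BY group_id, created_date;
```

For group_id 5, for example, the two '000A' rows should show 0002 and 0006 in after_000A, while the 0002 and 0006 rows themselves show NULL.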