首页 > 解决方案 > sql 使用窗口函数查找高

问题描述

我有一张这样的桌子

+------+------+-------+--+
| Name | Date | Price |  |
+------+------+-------+--+
| x    | d1   |    50 |  |
| x    | d2   |    45 |  |
| x    | d3   |    55 |  |
| x    | d4   |    40 |  |
| x    | d5   |    48 |  |
| x    | d6   |    45 |  |
| x    | d7   |    44 |  |
| y    | d1   |   200 |  |
| y    | d2   |   205 |  |
| y    | d3   |   204 |  |
| y    | d4   |   189 |  |
| y    | d5   |   185 |  |
| y    | d6   |   187 |  |
| y    | d7   |   182 |  |
+------+------+-------+--+

我想在接下来的 3 天内找到每个名称和日期的高价。

输出:

+------+------+-------+---------+-----------+--+
| Name | Date | Price | High_pr | High_date |  |
+------+------+-------+---------+-----------+--+
| x    | d1   |    50 |      55 | d3        |  |
| x    | d2   |    45 |      55 | d3        |  |
| x    | d3   |    55 |      55 | d3        |  |
| x    | d4   |    40 |      48 | d5        |  |
| x    | d5   |    48 |      48 | d5        |  |
| x    | d6   |    45 |      45 | d6        |  |
| x    | d7   |    44 |      44 | d7        |  |
| y    | d1   |   200 |     205 | d2        |  |
| y    | d2   |   205 |     205 | d2        |  |
| y    | d3   |   204 |     204 | d3        |  |
| y    | d4   |   189 |     189 | d4        |  |
| y    | d5   |   185 |     187 | d6        |  |
| y    | d6   |   187 |     187 | d6        |  |
| y    | d7   |   182 |     182 | d7        |  |
+------+------+-------+---------+-----------+--+

尝试使用窗口功能following '3 day'但无法正常工作。

该表确实很大,有许多名称跨越 5 年的数据,需要一些最佳解决方案。谢谢你

PS:使用 PostgreSQL 12.4。

标签: sqlpostgresqldatetimesubquerywindow-functions

解决方案


我想知道这是否可以在不使用横向连接的情况下合理地完成。窗口函数似乎不是最佳的,但以下使用相关子查询确实有效:

select t.name, t.date, t.price, max_price_3,
       (select pd.date
        from unnest(t.pd) pd (date date, price int)
        where pd.price = t.max_price_3
        limit 1
       ) as date_3
from (select t.*,
             max( price ) over (partition by name order by date range between current row and interval '3 day' following) as max_price_3,
             array_agg( (date, price) ) over (partition by name order by date range between current row and interval '3 day' following) as pd
      from t
     ) t ;

是一个 db<>fiddle。

有一种方法可以只使用窗口函数来做到这一点。. . 但使用generate_series(). 这有点复杂,因为原始日期/价格在该期间的天数中向后传播——然后取最大行:

select name, dte, price, price_3, date as date_3
from (select t.*, dte, min(date) over (partition by name) as min_date,
             max(t.price) over (partition by t.name, gs.dte) as price_3,
             row_number() over (partition by t.name, gs.dte order by price desc) as seqnum
      from t cross join
           generate_series(date, date - interval '3 day', interval '-1 day') gs(dte)
     ) t
where seqnum = 1 and dte >= min_date
order by t.name, t.date;

推荐阅读