首页 > 解决方案 > 过滤 row_number 结果的重复项

问题描述

我正在尝试在 PostgreSQL 上进行查询,该查询为我提供了每月花费更多时间的前 10 个工作(不包括当月),到目前为止,我已经进行了此查询,但它给了我工作名称的重复项。我该如何过滤这些?

SELECT job, month, duration  
FROM (
    SELECT
    month,
    job,
    duration,
    ROW_NUMBER() OVER (PARTITION BY month ORDER BY duration DESC) AS RN
FROM
    run_history
WHERE
    owner = 'john'
    ) x
WHERE RN <= 10
AND month < TO_CHAR(CURRENT_DATE, 'yyyymm')

标签: postgresql

解决方案


听起来每行可以有多行,(owner, month, job)并且您希望使用每个作业每月的最长持续时间。

如果是这样,请先聚合计算max(duration),然后row_number()在其之上使用:

SELECT job, month, max_duration
FROM  (
   SELECT month, job, max(duration) AS max_duration
        , row_number() OVER (PARTITION BY month ORDER BY max(duration) DESC NULLS LAST) AS rn
   FROM   run_history
   WHERE  owner = 'john'
   AND    month < to_char(CURRENT_DATE, 'yyyymm')
   GROUP  BY month, job
   ) sub
WHERE  rn <= 10
ORDER  BY month DESC, rn;

旁白:考虑integerdate代替text专栏month:更清洁,更高效。


推荐阅读