SQL: group by a field and return only one joined row per group

Problem description

Table data

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 2   |  7 August      |  cat   |  Y             |
| 3   |  10 August     |  cat   |  Z             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

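I want to group by name and then, for each group, select one of the rows with the earliest required_by date.

For this data set, I want to end up with either rows 1 and 4 or rows 2 and 4.

Expected result: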

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 1   |  7 August      |  cat   |  X             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

or

+-----+----------------+--------+----------------+
| ID  |  Required_by   |  Name  |  Another_Field |
+-----+----------------+--------+----------------+
| 2   |  7 August      |  cat   |  Y             |
| 4   |  11 August     |  dog   |  A             |
+-----+----------------+--------+----------------+

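I have a query that returns rows 1, 2, and 4, but I'm not sure how to pick just one row from the first group to get the desired result. I join the grouped result back to the data table so that id and another_field are available after grouping.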

SELECT d.id, d.name, d.required_by, d.another_field
FROM 
(
  SELECT min(required_by) as min_date, name
  FROM data
  GROUP BY name
) agg
INNER JOIN 
data d
on d.required_by = agg.min_date AND d.name = agg.name

Tags: sql, postgresql, group-by, greatest-n-per-group, postgres-9.6

Solution

This is typically solved with a window function:

select d.id, d.name, d.required_by, d.another_field
from (
  select id, name, required_by, another_field, 
         row_number() over (partition by name order by required_by) as rn
  from data
) d
where d.rn = 1;
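
To try the window-function query against the sample rows without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module. Several things in it are assumptions rather than part of the original answer: SQLite (3.25 or later, for window function support) stands in for Postgres, the dates are stored as ISO strings with a made-up year so that they sort chronologically, the helper names build_sample_db and earliest_per_name are hypothetical, and id is added to the ORDER BY as a tie-breaker so the same row is picked on every run.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the four sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data ("
        "id INTEGER PRIMARY KEY, required_by TEXT, name TEXT, another_field TEXT)"
    )
    # Assumption: ISO date strings with an arbitrary year, so text order == date order.
    conn.executemany(
        "INSERT INTO data (id, required_by, name, another_field) VALUES (?, ?, ?, ?)",
        [
            (1, "2023-08-07", "cat", "X"),
            (2, "2023-08-07", "cat", "Y"),
            (3, "2023-08-10", "cat", "Z"),
            (4, "2023-08-11", "dog", "A"),
        ],
    )
    return conn


# Same shape as the query in the answer; "id" in the ORDER BY is an added
# tie-breaker so that ties on required_by are resolved deterministically.
QUERY = """
SELECT d.id, d.name, d.required_by, d.another_field
FROM (
  SELECT id, name, required_by, another_field,
         row_number() OVER (PARTITION BY name ORDER BY required_by, id) AS rn
  FROM data
) d
WHERE d.rn = 1
ORDER BY d.id
"""


def earliest_per_name(conn):
    """Return one row per name: the one with the earliest required_by."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in earliest_per_name(build_sample_db()):
        print(row)
```

With the tie-breaker in place, the cat group resolves to row 1 rather than row 2. Without it, either of the two rows dated 7 August may come back, which the question states is acceptable.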

In Postgres, using distinct on () is usually faster:

select distinct on (name) *
from data
order by name, required_by;


