首页 > 解决方案 > Efficient way to combine 2 tables and get the row with max year with preference to one of the table

问题描述

I am trying to combine 2 tables (key_ratios_cnd and key_ratios_snd) both tables are identical and primary key columns for both tables are symbol and fiscal_year.

In the final result set i want the row with maximum year in both the tables for each symbol. if the row with maximum year is present in both the tables then row from key_ratios_cnd should be selected.

I come up with below SQL query to give the result. I wanted to know if their are any other way to write the query that is more optimized.

select sq2.*
from 
    (select sq.*,
           max(id) over(partition by sq.symbol) as max_id,
           max(fiscal_year) over(partition by sq.symbol) as max_year
     from 
        ( select *,'2' as id
        from test.key_ratios_cnd
        union all
        select *,'1' as id
        from test.key_ratios_snd
        ) as sq
    ) as sq2
where id = max_id and fiscal_year = max_year
order by symbol asc

标签: sqlpostgresql

解决方案


我会先从每个表中选择一行然后合并。Postgresdistinct on非常适合这个目的。

select distinct on (symbol) sc.*
from ((select distinct on (cnd.symbol) cnd.*, 1 as ord
       from test.key_ratios_cnd cnd
       order by cnd.symbol, cnd.fiscal_year desc
      ) union all
      (select distinct on (snd.symbol) cnd.*, 2 as ord
       from test.key_ratios_cnd cnd
       order by snd.symbol, snd.fiscal_year desc
      )
     ) sc
order by symbol, fiscal_year desc, ord;

为了加快速度,(symbol, fiscal_year desc)为每个表添加一个索引。


推荐阅读