Long wait on a SELECT query over a table with 70 million records. How can performance be improved?

Problem description

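I have a table in Postgres with more than 70 million records that associate a temperature with a specific time (day) and place (weather station). Given a time period and a set of weather stations, I need to compute things such as sums, averages, quartiles, and climatological normals. The queries I am running take about 30 seconds to return. How can I reduce this wait?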
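For illustration, a query of the kind described above might look like the sketch below. Only the table waterbalances and the columns date and p appear in the question. The column station_id, the date range, and the station IDs are hypothetical placeholders, and p is assumed to be a numeric type.

```sql
-- Sketch only: station_id, the dates and the station IDs are hypothetical;
-- waterbalances, date and p are taken from the query in the question.
SELECT
    date_trunc('month', date)                         AS month,
    sum(p)                                            AS total,
    avg(p)                                            AS mean,
    percentile_cont(0.25) WITHIN GROUP (ORDER BY p)   AS q1,
    percentile_cont(0.50) WITHIN GROUP (ORDER BY p)   AS median,
    percentile_cont(0.75) WITHIN GROUP (ORDER BY p)   AS q3
FROM waterbalances
WHERE date >= DATE '2000-01-01'          -- start of the period (inclusive)
  AND date <  DATE '2001-01-01'          -- end of the period (exclusive)
  AND station_id IN (101, 102, 103)      -- the chosen set of weather stations
GROUP BY 1
ORDER BY 1;
```

Whether such a query can avoid scanning the whole table depends on the indexes available on the filtered columns, which the question does not describe.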

This is the output of explain (analyze, buffers) select avg(p) as rain FROM waterbalances group by extract(month from date), extract(year from date);

                                                                           QUERY PLAN                                   
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=3310337.68..3314085.15 rows=13836 width=24) (actual time=21252.008..21252.624 rows=478 loops=1)
   Group Key: (date_part('month'::text, (date)::timestamp without time zone)), (date_part('year'::text, (date)::timestamp without time zone))
   Buffers: shared hit=6335 read=734014
   ->  Gather Merge  (cost=3310337.68..3313566.30 rows=27672 width=48) (actual time=21251.984..21261.693 rows=1432 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=15841 read=2195624
         ->  Sort  (cost=3309337.66..3309372.25 rows=13836 width=48) (actual time=21130.846..21130.862 rows=477 loops=3)
               Sort Key: (date_part('month'::text, (date)::timestamp without time zone)), (date_part('year'::text, (date)::timestamp without time zone))
               Sort Method: quicksort  Memory: 92kB
               Worker 0:  Sort Method: quicksort  Memory: 92kB
               Worker 1:  Sort Method: quicksort  Memory: 92kB
               Buffers: shared hit=15841 read=2195624
               ->  Partial HashAggregate  (cost=3308109.29..3308386.01 rows=13836 width=48) (actual time=21130.448..21130.618 rows=477 loops=3)
                     Group Key: date_part('month'::text, (date)::timestamp without time zone), date_part('year'::text, (date)::timestamp without time zone)
                     Buffers: shared hit=15827 read=2195624
                     ->  Parallel Seq Scan on waterbalances  (cost=0.00..3009020.66 rows=39878483 width=24) (actual time=1.528..15460.388 rows=31914000 loops=3)
                           Buffers: shared hit=15827 read=2195624
 Planning Time: 7.621 ms
 Execution Time: 21262.552 ms
(20 rows)

Tags: postgresql

Solution

