Postgres query on a partitioned table is 2x slower than on the non-partitioned table

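Question

We have a table with about 4 million records, and we created a partitioned version of it on the assumption that SELECT queries would be faster with partitioning enabled. Instead, the SELECT on the partitioned table is 2x slower!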

  1. On the original table (24 ms):
    explain analyse select * from tbl_original where device_info_id = 5;

  2. On the partitioned table (49 ms):
    explain analyse select * from tbl_partitioned where device_info_id = 5;

Here is the EXPLAIN ANALYZE output for tbl_original:

QUERY PLAN                                                                                                                    
------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tbl_original  (cost=61.19..9515.02 rows=2679 width=379) (actual time=0.297..13.008 rows=3369 loops=1)     
  Recheck Cond: (device_info_id = 5)                                                                                          
  Heap Blocks: exact=554                                                                                                      
  ->  Bitmap Index Scan on idx_tbl_original  (cost=0.00..60.52 rows=2679 width=0) (actual time=0.232..0.232 rows=3369 loops=1)
        Index Cond: (device_info_id = 5)                                                                                      
Planning time: 0.087 ms                                                                                                       
Execution time: 24.890 ms                                                                                                     

Here is the EXPLAIN ANALYZE output for tbl_partitioned:

QUERY PLAN                                                                                                                                                 
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Append  (cost=0.00..6251.14 rows=3697 width=404) (actual time=0.034..36.635 rows=3369 loops=1)                                                             
  ->  Seq Scan on tbl_partitioned  (cost=0.00..0.00 rows=1 width=1069) (actual time=0.006..0.006 rows=0 loops=1)                                           
        Filter: (device_info_id = 5)                                                                                                                       
  ->  Index Scan using idx_tbl_partitioned_p1 on tbl_partitioned_p1  (cost=0.42..6251.14 rows=3696 width=404) (actual time=0.017..12.922 rows=3369 loops=1)
        Index Cond: (device_info_id = 5)                                                                                                                   
Planning time: 0.184 ms                                                                                                                                    
Execution time: 49.129 ms                                                                                                                                  

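It looks like the most expensive operation in the partitioned query is the Index Scan, with an estimated cost of 6251.14 units. But given how small the partitioned table is compared to the original table, this index scan should be very fast. Not sure if we are missing something obvious here!

Any help optimizing the query or the partitioned table would be appreciated.

Note: the partitioned table was created as follows: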

CREATE TABLE tbl_partitioned (LIKE tbl_original);

CREATE TABLE tbl_partitioned_p1 (
    CONSTRAINT pk_tbl_partitioned_p1 PRIMARY KEY (id),
    CONSTRAINT ck_tbl_partitioned_p1 CHECK ( device_info_id < 10 )
) INHERITS (tbl_partitioned);

CREATE INDEX idx_tbl_partitioned_p1 ON tbl_partitioned_p1 (device_info_id);
CREATE INDEX idx_tbl_partitioned ON tbl_partitioned (device_info_id);

INSERT INTO tbl_partitioned_p1 SELECT * from tbl_original where device_info_id < 10;

The table sizes are:

select count(*) from tbl_partitioned; -- 413696 rows
select count(*) from tbl_original;    -- 4417025 rows

select count(*) from tbl_original where device_info_id = 5; -- 3369 rows
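Row counts do not say how many blocks each table and its indexes occupy on disk, which is what the scans actually touch. A minimal sketch for comparing on-disk sizes, assuming the table names from the question; note that a size function called on the parent tbl_partitioned does not include its child tables, so the child is queried directly:

```sql
-- Heap-only size versus total size (heap + indexes + TOAST) for each table.
SELECT pg_size_pretty(pg_relation_size('tbl_original'))             AS original_heap,
       pg_size_pretty(pg_total_relation_size('tbl_original'))       AS original_total,
       pg_size_pretty(pg_relation_size('tbl_partitioned_p1'))       AS p1_heap,
       pg_size_pretty(pg_total_relation_size('tbl_partitioned_p1')) AS p1_total;
```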

constraint_exclusion is set to partition.
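One way to confirm the setting and see whether the planner really uses the CHECK constraint is sketched below. The value 50 is a hypothetical one chosen only because it contradicts the constraint device_info_id < 10 on tbl_partitioned_p1; if exclusion is working, that child should be absent from the plan.

```sql
-- Show the value in effect for the current session.
SHOW constraint_exclusion;

-- 50 cannot satisfy CHECK (device_info_id < 10), so tbl_partitioned_p1
-- should not appear under the Append node when exclusion applies.
EXPLAIN SELECT * FROM tbl_partitioned WHERE device_info_id = 50;
```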

Tags: sql, postgresql, query-performance, database-partitioning, postgresql-9.6

Answer


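Try to get more explain data, for example:

explain (analyze, timing, costs, buffers, verbose) select * from tbl_original where device_info_id = 5;

Pay particular attention to the "hit" values in the output, for example:

Buffers: shared hit=4 read=224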

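read=xxx means a block had to be read from outside shared buffers (from disk or the OS cache). hit=xxx means it came from RAM (shared buffers). More of the data for one table may already be in shared buffers than for the other; performance depends heavily on this.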
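A sketch of how the comparison could be run, assuming the two tables from the question. Running each statement more than once and comparing only the later runs keeps a cold cache from skewing either side. The last statement is an extra suggestion: TIMING OFF removes the per-node clock reads that EXPLAIN ANALYZE adds, which can inflate the reported execution time on some systems.

```sql
-- Compare the "Buffers: shared hit=... read=..." lines of the two plans.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tbl_original    WHERE device_info_id = 5;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tbl_partitioned WHERE device_info_id = 5;

-- Same query without per-node timing; only the total execution time is measured.
EXPLAIN (ANALYZE, BUFFERS, TIMING OFF)
SELECT * FROM tbl_partitioned WHERE device_info_id = 5;
```

If the hit and read counts are similar for both tables while the timings still differ, caching is probably not the explanation and the plans themselves (Bitmap Heap Scan versus Index Scan) deserve a closer look.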

