首页 > 解决方案 > postgres 是否支持同一 jsonb 列的不同属性的多列索引?

问题描述

我创建了一个多列索引,在其中索引了同一 jsonb 列的 3 个属性。即使 where 子句使用了确切的 3 个属性,看起来也没有使用索引。这不支持吗?

我正在使用 postgres 9.6 并尝试了 btree 和 gin

这是杜松子酒指数

CREATE INDEX custom__observation__multicol  
    ON fhir.observation USING gin  
    (((resource -> 'subject'::text) -> 'reference'::text),  
(resource -> 'effectiveDateTime'::text),  
((resource -> 'code'::text) -> 'coding'::text))  
TABLESPACE pg_default;

查询计划如下所示

Bitmap Heap Scan on observation  (cost=600.20..2701.74 rows=1 width=766) (actual time=17749.389..25453.827 rows=1 loops=1)Filter: (is_latest AND (NOT is_deleted) AND ((resource -> 'effectiveDateTime'::text) = to_jsonb('2019-07-31T18:07:52Z'::text)) AND (((resource -> 'subject'::text) ->> 'reference'::text) = 'Patient/e69583dc510c4ed59e362e3ac8c803b5'::text))
      Recheck Cond: (((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb)
      Filter: (is_latest AND (NOT is_deleted) AND ((resource -> 'effectiveDateTime'::text) = to_jsonb('2019-07-31T18:07:52Z'::text)) AND (((resource -> 'subject'::text) ->> 'reference'::text) = 'Patient/e69583dc510c4ed59e362e3ac8c803b5'::text))
      Rows Removed by Filter: 93330  
      Heap Blocks: exact=52088
      Buffers: shared hit=373 read=52392 written=25
      ->  Bitmap Index Scan on custom__observation__nulticol  (cost=0.00..600.20 rows=560 width=0) (actual time=65.817..65.817 rows=93331 loops=1)
          Index Cond: (((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb)
          Buffers: shared hit=373 read=304
Planning time: 0.143 ms
Execution time: 25453.872 ms

我还对表运行了分析命令,但看起来并没有应用所有三个索引。pgAdmin 中返回查询大约需要 30 秒

我使用“CREATE EXTENSION btree_gin”启用了 btree_gin 查询计划看起来相同,见下文

    QUERY PLAN                     
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on observation  (cost=92.19..2190.12 rows=1 width=768) (actual time=23580.396..31767.965 rows=1 loops=1)
   Recheck Cond: (((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb)
   Filter: (is_latest AND (NOT is_deleted) AND ((resource -> 'effectiveDateTime'::text) = to_jsonb('2019-07-31T18:07:52Z'::text)) AND (((resource -> 'subject'::text) ->> 'reference'::text) = 'Patient/e69583dc510c4ed59e362e3ac8c803b5'::text))
   Rows Removed by Filter: 93330
   Heap Blocks: exact=52088
   Buffers: shared hit=226 read=52372
   ->  Bitmap Index Scan on custom__observation__multicol  (cost=0.00..92.19 rows=559 width=0) (actual time=1304.418..1304.418 rows=93331 loops=1)
         Index Cond: (((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb)
         Buffers: shared hit=226 read=284
 Planning time: 1.598 ms
 Execution time: 31768.323 ms

根据下面的评论,我创建了两个索引

CREATE INDEX custom__observation__codeindex
    ON fhir.observation USING gin
    (((resource -> 'code'::text) -> 'coding'::text))
    TABLESPACE pg_default;

CREATE INDEX custom__observation__multicol
    ON fhir.observation USING btree
    (((resource -> 'subject'::text) ->> 'reference'::text) COLLATE pg_catalog."default", (resource ->> 'effectiveDateTime'::text) COLLATE pg_catalog."default")
    TABLESPACE pg_default;

查询要快得多,这是查询计划

    QUERY PLAN                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on observation  (cost=834.97..890.83 rows=1 width=767) (actual time=2599.083..3460.635 rows=1 loops=1)
   Recheck Cond: ((((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb) AND (((resource -> 'subject'::text) ->> 'reference'::text) = 'Patient/e69583dc510c4ed59e362e3ac8c803b5'::text))
   Filter: (is_latest AND (NOT is_deleted) AND ((resource -> 'effectiveDateTime'::text) = to_jsonb('2019-07-31T18:07:52Z'::text)))
   Rows Removed by Filter: 2331
   Heap Blocks: exact=2286
   Buffers: shared hit=217 read=2720
   ->  BitmapAnd  (cost=834.97..834.97 rows=14 width=0) (actual time=1241.916..1241.916 rows=0 loops=1)
         Buffers: shared hit=217 read=434
         ->  Bitmap Index Scan on custom__observation__codeindex  (cost=0.00..128.19 rows=559 width=0) (actual time=951.950..951.950 rows=93331 loops=1)
               Index Cond: (((resource -> 'code'::text) -> 'coding'::text) @> '[{"code": "57485005", "system": "http://snomed.info/sct", "display": "Support"}]'::jsonb)
               Buffers: shared hit=217 read=284
         ->  Bitmap Index Scan on custom__observation__multicol  (cost=0.00..706.52 rows=14130 width=0) (actual time=137.624..137.624 rows=13985 loops=1)
               Index Cond: (((resource -> 'subject'::text) ->> 'reference'::text) = 'Patient/e69583dc510c4ed59e362e3ac8c803b5'::text)
               Buffers: shared read=150
 Planning time: 5.311 ms
 Execution time: 3460.907 ms
(16 rows)

似乎这两个索引工作得更好,但多列索引没有应用有效日期时间属性,所以这基本上仍然是同一个问题,但现在有两个属性而不是三个。

标签: postgresql

解决方案


为了支持这个查询,我建议使用两个不同的索引:

支持@>运营商的一项指标:

CREATE INDEX ON fhir.observation USING gin ((resource -> 'code') -> 'coding'));

还有一个 BTree 索引来支持=标量值上的运算符:

CREATE INDEX 
   ON fhir.observation ( (resource -> 'subject' -> 'reference'), (resource -> 'effectiveDateTime') );

第二个和第三个表达式没有使用多列索引的原因是 GIN 索引只支持运算符: ??&和。?|@>

所以 GIN 索引不能用于使用=.

这就是相等条件需要第二个 BTree 索引的原因。


推荐阅读