首页 > 解决方案 > 与 JSONB 列上的 GIN INDEX 的搜索性能类似

问题描述

我对 Postgres 比较陌生。

我有一个 JSONB 列,其中包含如下示例数据。

{
    "book_data": {
        "author": "abcd",
        "title": "This is a literature book",
        "year": "2021",
        "price":2000,
        "noofpages":100,
        "subject": "Language"
    }
}

在这里,用户可以按作者、年份、价格、主题、标题的值进行搜索。当用户按标题搜索时,他们可以进行部分或全部搜索。所以在这里用户可以搜索整个标题(“这是一本文学书”)或部分(比如“文学”)我们在列上有一个 GIN 索引。当我们对值进行完全匹配时,查询性能良好。

select * FROM book_ms.book_data a
WHERE a.book_details  @@ '$.book_data.author == "abcd" 
&& $.book_data.title == "This is a litreature book"'

但是当进行如下部分匹配时,需要很长时间。我将仅对标题进行部分搜索。对于其他人,它将始终是完全匹配的。有什么办法可以提高点赞搜索的速度

select * FROM book_ms.book_data a
WHERE a.book_details  @@ '$.book_data.author == "abcd" 
&& $.book_data.title like_regex "litreature"'

我必须为此寻找文本索引吗?如果是这样,我们是否有一些与 oracle 文本索引中使用的节组等效的东西

下面解释(分析,缓冲区) - 对于类似的匹配

"Bitmap Heap Scan on book_data a  (cost=71.57..1784.58 rows=461 width=935) (actual time=63.726..600.494 rows=3430 loops=1)"
"  Recheck Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" like_regex ""litreature"")'::jsonpath)"
"  Rows Removed by Index Recheck: 198778"
"  Heap Blocks: exact=30016"
"  Buffers: shared hit=30241"
"  ->  Bitmap Index Scan on search_ult_text_ndx_3  (cost=0.00..71.45 rows=461 width=0) (actual time=57.008..57.009 rows=202208 loops=1)"
"        Index Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" like_regex ""litreature"")'::jsonpath)"
"        Buffers: shared hit=225"
"Planning Time: 0.216 ms"
"Execution Time: 601.197 ms"
"Note: This is not an Approved plan.  No usable Approved plan was found."
"SQL Hash: -403339798, Plan Hash: -176285915"

下面是精确匹配

"Bitmap Heap Scan on book_data a  (cost=135.57..1848.58 rows=461 width=935) (actual time=23.597..23.707 rows=25 loops=1)"
"  Recheck Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" == ""This is a literature book"")'::jsonpath)"
"  Heap Blocks: exact=24"
"  Buffers: shared hit=137 read=17"
"  I/O Timings: read=21.189"
"  ->  Bitmap Index Scan on search_ult_text_ndx_3  (cost=0.00..135.45 rows=461 width=0) (actual time=23.572..23.572 rows=25 loops=1)"
"        Index Cond: (book_details @@ '($.""book_data"".""author"" == ""abcd"" && $.""book_data"".""title"" == ""This is a literature book"")'::jsonpath)"
"        Buffers: shared hit=113 read=17"
"        I/O Timings: read=21.189"
"Planning Time: 0.207 ms"
"Execution Time: 23.732 ms"
"Note: This is not an Approved plan.  No usable Approved plan was found."
"SQL Hash: -403339798, Plan Hash: -176285915"

标签: postgresqlindexingsql-likejsonbregexp-like

解决方案


没有索引可以加速这种WHERE情况。

如果路径总是{book_data,author},你可以写:

WHERE book_details -> 'book_data' ->> 'author' LIKE '%litreature%'

book_details -> 'book_data' ->> 'author'然后可以使用三元组索引:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX ON book_ms.book_data USING gin
   ((book_details -> 'book_data' ->> 'author') gin_trgm_ops);

推荐阅读