首页 > 解决方案 > Postgresql 索引不用于>30s 的慢查询

问题描述

POSTGRESQL 版本:10 硬件:4 个工人 / 16GBRAM / 50% 已使用

我不是 Postgresql 专家。我刚刚阅读了很多文档并做了很多测试。由于表上有 1000 万行,我有一些 postgresql 查询需要很多时间 > 30 秒。

            Column            |           Type           | Collation | Nullable |                         Default                          
------------------------------+--------------------------+-----------+----------+----------------------------------------------------------
 id                           | integer                  |           | not null | 
 cveid                        | character varying(50)    |           |          | 
 summary                      | text                     |           | not null | 
 published                    | timestamp with time zone |           |          | 
 modified                     | timestamp with time zone |           |          | 
 assigner                     | character varying(128)   |           |          | 
 vulnerable_products          | character varying(250)[] |           |          | 
 cvss                         | double precision         |           |          | 
 cvss_time                    | timestamp with time zone |           |          | 
 cvss_vector                  | character varying(250)   |           |          | 
 access                       | jsonb                    |           | not null | 
 impact                       | jsonb                    |           | not null | 
 score                        | integer                  |           | not null | 
 is_exploitable               | boolean                  |           | not null | 
 is_confirmed                 | boolean                  |           | not null | 
 is_in_the_news               | boolean                  |           | not null | 
 is_in_the_wild               | boolean                  |           | not null | 
 reflinks                     | jsonb                    |           | not null | 
 reflinkids                   | jsonb                    |           | not null | 
 created_at                   | timestamp with time zone |           |          | 
 history_id                   | integer                  |           | not null | nextval('vulns_historicalvuln_history_id_seq'::regclass)
 history_date                 | timestamp with time zone |           | not null | 
 history_change_reason        | character varying(100)   |           |          | 
 history_type                 | character varying(1)     |           | not null |
Indexes:
    "vulns_historicalvuln_pkey" PRIMARY KEY, btree (history_id)
    "btree_varchar" btree (history_type varchar_pattern_ops)
    "vulns_historicalvuln_cve_id_850876bb" btree (cve_id)
    "vulns_historicalvuln_cwe_id_2013d697" btree (cwe_id)
    "vulns_historicalvuln_history_user_id_9e25ebf5" btree (history_user_id)
    "vulns_historicalvuln_id_773f2af7" btree (id)
--- TRUNCATE
Foreign-key constraints:
    "vulns_historicalvuln_history_user_id_9e25ebf5_fk_custusers" FOREIGN KEY (history_user_id) REFERENCES custusers_user(id) DEFERRABLE INITIALLY DEFERRED

查询示例:SELECT * FROM vulns_historicalvuln WHERE history_type <> '+' order by id desc 仅获取前 10000 行;-> 30s 没有缓存

查询计划

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..31878.33 rows=10000 width=1736) (actual time=0.173..32839.474 rows=10000 loops=1)
   ->  Index Scan Backward using vulns_historicalvuln_id_773f2af7 on vulns_historicalvuln  (cost=0.43..26346955.92 rows=8264960 width=1736) (actual time=0.172..32830.958 rows=10000 loops=1)
         Filter: ((history_type)::text <> '+'::text)
         Rows Removed by Filter: 296
 Planning time: 19.514 ms
 Execution time: 32845.015 ms

SELECT DISTINCT "vulns"."id", "vulns"."uuid", "vulns"."feedid", "vulns"."cve_id", "vulns"."cveid", "vulns"."summary", "vulns"."published", "vulns"."modified", "vulns"."assigner", "vulns"."cwe_id", "vulns"."vulnerable_packages_versions", "vulns"."vulnerable_products", "vulns" "."vulnerable_product_versions", "vulns"."cvss", "vulns"."cvss_time", "vulns"."cvss_version", "vulns"."cvss_vector", "vulns"."cvss_metrics", "vulns"。 “访问”,“漏洞”。“影响”,“vulns"."cvss3", "vulns"."cvss3_vector", "vulns"."cvss3_version", "vulns"."cvss3_metrics", "vulns"."score", "vulns"."is_exploitable", "vulns" .“is_confirmed”,“vulns”。“is_in_the_news”,“vulns”。“is_in_the_wild”,“vulns”。“reflinks”,“vulns”。“reflinkids”,“vulns”。“created_at”,“vulns”。 updated_at", "vulns"."id" AS "exploit_count", false AS "monitored", '42' AS "org" FROM "vulns" WHERE ("vulns"."score" >= 0 AND "vulns"."分数“<= 100)按“vulns”排序。“updated_at”DESC 限制 10


查询计划

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=315191.32..315192.17 rows=10 width=1691) (actual time=3013.964..3013.990 rows=10 loops=1)
   ->  Unique  (cost=315191.32..329642.42 rows=170013 width=1691) (actual time=3013.962..3013.986 rows=10 loops=1)
         ->  Sort  (cost=315191.32..315616.35 rows=170013 width=1691) (actual time=3013.961..3013.970 rows=10 loops=1)
               Sort Key: updated_at DESC, id, uuid, feedid, cve_id, cveid, summary, published, modified, assigner, cwe_id, vulnerable_packages_versions, vulnerable_products, vulnerable_product_versions, cvss, cvss_time, cvss_version, cvss_vector, cvss_metrics, access, impact, cvss3, cvss3_vector, cvss3_version, cvss3_metrics, score, is_exploitable, is_confirmed, is_in_the_news, is_in_the_wild, reflinks, reflinkids, created_at
               Sort Method: external merge  Disk: 277648kB
               ->  Seq Scan on vulns  (cost=0.00..50542.19 rows=170013 width=1691) (actual time=0.044..836.597 rows=169846 loops=1)
                     Filter: ((score >= 0) AND (score <= 100))
 Planning time: 3.183 ms
 Execution time: 3070.346 ms

我创建了一个btree_varchar" btree (history_type varchar_pattern_ops)像这样的 btree varchar 索引: CREATE INDEX CONCURRENTLY btree_varchar ON vulns_historicalvuln (history_type varchar_pattern_ops);

我还为我的第二个查询创建了一个 vulns 分数索引:CREATE INDEX CONCURRENTLY ON vulns (score);

我阅读了很多关于慢查询和索引的帖子和文档。我确定这是关于慢查询的解决方案,但 Postgresql 的查询计划不使用我创建的索引。它估计它使用 seq 扫描比使用索引处理得更快......

SELECT relname, indexrelname, idx_scan FROM   pg_catalog.pg_stat_user_indexes;

               relname               |                          indexrelname                           |  idx_scan  
-------------------------------------+-----------------------------------------------------------------+------------
 vulns_historicalvuln                | btree_varchar                                                   |          0

你能告诉我我的索引是否设计得很好吗?我如何调试它,如果需要,请随时询问更多信息。

谢谢

标签: postgresql

解决方案


经过一番研究,我明白索引不是我的问题的解决方案。该字段的低基数(重复值)使索引无用。

这里查询postgresql的时间是正常的,因为匹配了30M行。我关闭了这个问题,因为这里的索引没有问题。


推荐阅读