首页 > 解决方案 > 在 Postgres 中获得较高的查询执行时间

问题描述

Table: Customer

Type:

telephone1  | character varying(255)
telephone2  | character varying(255)
location_id | integer 

Index:

"idx_customers_location_id" btree (location_id)
"idx_customers_telephone1_txt" btree (telephone1 text_pattern_ops)
"idx_customers_trim_telephone_1" btree (btrim(telephone1::text))
"idx_customers_trim_telephone2" btree (btrim(telephone2::text))

我有一个名为 customers 的表,总行数为141182。我正在检查两列(telephone1,telephone2)中的值,所有列telephone1都有数据,但只有8行有telephone2列的值

当我检查值 1时,得到低于执行时间的值。

SELECT  customers.id, location_id, telephone1, telephon2 FROM "customers" INNER JOIN "locations" ON 
"locations"."id" = "customers"."location_id" WHERE (customers.location_id = 189 AND (telephone1 = '1'
OR telephone2 = '1')) GROUP BY customers.id LIMIT 20 OFFSET 0;


Limit  (cost=519.62..519.64 rows=4 width=125) (actual time=25.895..25.898 rows=1 loops=1)
   ->  GroupAggregate  (cost=519.62..519.64 rows=4 width=125) (actual time=25.893..25.896 rows=1 loops=1)
         Group Key: customers.id
         ->  Sort  (cost=519.62..519.62 rows=4 width=127) (actual time=25.876..25.879 rows=1 loops=1)
               Sort Key: customers.id
               Sort Method: quicksort  Memory: 25kB
               ->  Nested Loop  (cost=8.62..519.61 rows=4 width=127) (actual time=10.740..25.869 rows=1 loops=1)
                     ->  Index Scan using locations_pkey on locations  (cost=0.06..4.06 rows=1 width=70) (actual time=0.027..0.029 rows=1 loops=1)
                           Index Cond: (id = 189)
                     ->  Bitmap Heap Scan on customers  (cost=8.56..515.54 rows=4 width=61) (actual time=10.707..25.832 rows=1 loops=1)
                           Recheck Cond: (((telephone1)::text = '1'::text) OR ((telephone2)::text = '1'::text))
                           Filter: (location_id = 189)
                           Rows Removed by Filter: 1048
                           Heap Blocks: exact=1737
                           ->  BitmapOr  (cost=8.56..8.56 rows=259 width=0) (actual time=3.445..3.446 rows=0 loops=1)
                                 ->  Bitmap Index Scan on idx_customers_telephone1_txt  (cost=0.00..2.10 rows=7 width=0) (actual time=0.065..0.066 rows=99 loops=1)
                                       Index Cond: ((telephone1)::text = '1'::text)
                                 ->  Bitmap Index Scan on idx_customers_telephone2_txt  (cost=0.00..6.47 rows=253 width=0) (actual time=3.378..3.378 rows=1664 loops=1)
                                       Index Cond: ((telephone2)::text = '1'::text)
 Planning Time: 0.419 ms
 Execution Time: 25.995 ms

当我检查值 0时,执行时间( 7753.216 ms)发生了巨大变化

Limit  (cost=0.14..2440.90 rows=10 width=125) (actual time=5900.924..7753.133 rows=4 loops=1)
   ->  GroupAggregate  (cost=0.14..292402.20 rows=1198 width=125) (actual time=5900.922..7753.129 rows=4 loops=1)
         Group Key: customers.id
         ->  Nested Loop  (cost=0.14..292395.61 rows=1198 width=127) (actual time=4350.358..7753.087 rows=4 loops=1)
               ->  Index Scan using customers_pkey on customers  (cost=0.09..292387.36 rows=1198 width=61) (actual time=4350.338..7753.054 rows=4 loops=1)
                     Filter: ((location_id = 189) AND (((telephone1)::text = '0'::text) OR ((telephone2)::text = '0'::text)))
                     Rows Removed by Filter: 8484280
               ->  Materialize  (cost=0.06..4.06 rows=1 width=70) (actual time=0.005..0.005 rows=1 loops=4)
                     ->  Index Scan using locations_pkey on locations  (cost=0.06..4.06 rows=1 width=70) (actual time=0.013..0.013 rows=1 loops=1)
                           Index Cond: (id = 189)
 Planning Time: 0.322 ms
 Execution Time: 7753.216 ms

是否有任何特殊原因需要更多时间来执行值 0?或者这里有什么问题?

还有一件事我注意到这个问题只发生在telephone2 栏上。

标签: postgresqlquery-optimization

解决方案


但只有 8 行具有列电话 2 的值

您的解释计划另有说明,找到 1664 行具有一个特定值的电话 2。现在可能其中大部分都不可见,但在这种情况下,您确实需要对表进行 VACUUM ANALYZE。

Nested Loop  (cost=0.14..292395.61 rows=1198 width=127) (actual time=4350.358..7753.087 rows=4 loops=1)

对于第二个查询,它认为它将找到 1198 行(如果运行完成)。但它认为它可以在前 20 个之后停止,所以这将是指数的 1.67%。但实际上只有 4 行,所以它出乎意料地必须扫描整个索引而不能提前停止。

为什么估计差这么多?我不知道,它可能只是过时的统计数据(再次,VACUUM ANALYZE table),或者列之间可能存在一些相互关系,即使使用准确的统计数据也很难进行估计。

加入地点有什么意义?


推荐阅读