postgresql - 当子选择很快时,了解为什么“从不存在中删除”会变慢
问题描述
我有一个困扰着我的查询,我找到了解决方法,但我也想了解为什么它首先会发生以从中学习一些东西。
背景: 我有一个带有“对象”的数据库,这些对象具有连接到它们的动态“属性”,用户可以搜索和过滤值。我们知道查询的最佳方式是运行单个语句,但由于可应用于每个客户端的静态和动态属性的各种过滤器的复杂性,我们选择在每个搜索查询的开头。我们从“对象”表中粗略选择并从中创建临时表。任何进一步的过滤器都会进行选择并从搜索表中删除。当没有更多过滤器可以应用时,最终的“搜索结果”会呈现给用户。
这种特殊情况会在初始搜索表中添加约 120.000 行,这是非常基本的,只有一些 ID。我们不会在搜索表上创建任何索引,因为在以前的任何情况下都没有证明这是必要的。此客户端的属性表由 ~17.000.000 行组成,其中重要的列是object_id、language_id、propertytype_id 和 stringvalue,我们有一个组合 BTREE 索引。
此查询返回相当快,从包含我们过滤值的属性表中选择所有 ID:
SELECT o."id"
FROM "object_table" o
INNER JOIN "propertyvalue_table" pt
ON o."id" = pt."object_id"
AND pt."propertytype_id" = 41
AND pt."language_id" = 1
GROUP BY o."id"
HAVING count(1) = <numberOfFilterValues>
AND array_agg(LOWER(pt."stringvalue")) @> string_to_array(lower('<filtervalue(s)>'), ',')
但是当我将该查询注入删除语句时,它突然变得非常糟糕:
DELETE FROM "temporary_search_table"
WHERE NOT EXISTS (
SELECT 1
FROM "object_table" o
INNER JOIN "propertyvalue_table" pt
ON o."id" = pt."object_id"
AND pt."propertytype_id" = 41
AND pt."language_id" = 1
GROUP BY o."id"
HAVING count(1) = <numberOfFilterValues>
AND array_agg(LOWER(pt."stringvalue")) @> string_to_array(lower('<filtervalue(s)>'), ',')
WHERE temporary_search_table."id" = o."id"
)
删除查询的解释分析需要 15 分钟才能完成,我对解释的有限理解告诉我子查询在临时表中的每行循环一次,但我不明白为什么!(分析包含一些额外的连接,我在示例中省略了这些连接以减少混乱)
Aggregate (cost=29796768.84..29796768.85 rows=1 width=8) (actual time=1234845.455..1234845.456 rows=1 loops=1)
-> Seq Scan on testsearch (cost=0.00..29796594.59 rows=69700 width=0) (actual time=74.967..1234814.642 rows=98515 loops=1)
Filter: (NOT (SubPlan 2))
Rows Removed by Filter: 30098
SubPlan 2
-> GroupAggregate (cost=195.19..213.73 rows=1 width=8) (actual time=9.600..9.600 rows=0 loops=128613)
Group Key: qo.id
Filter: ((count(1) = 1) AND (array_agg(lower((propertyvalue_1.stringvalue)::text)) @> '{inkognito}'::text[]))
Rows Removed by Filter: 0
InitPlan 1 (returns $0)
-> Index Scan using idx_propertyvalue_propertytypeid_languageid_stringvalue on propertyvalue (cost=0.81..188.51 rows=182 width=4) (actual time=0.029..43.752 rows=45356 loops=1)
Index Cond: ((propertytype_id = 41) AND (language_id = 1) AND (lower((stringvalue)::text) = ANY ('{inkognito}'::text[])))
-> Nested Loop (cost=6.69..25.21 rows=1 width=14) (actual time=9.423..9.597 rows=0 loops=128613)
-> Nested Loop (cost=6.13..8.44 rows=1 width=4) (actual time=0.009..0.011 rows=1 loops=128613)
Join Filter: (qo.type_id = qotp.type_id)
Rows Removed by Join Filter: 8
-> Index Scan using pk_object_id on object qo (cost=0.42..2.44 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=128613)
Index Cond: (id = testsearch.object_id)
-> HashAggregate (cost=5.71..5.80 rows=9 width=4) (actual time=0.000..0.001 rows=9 loops=128613)
Group Key: qotp.type_id
-> Nested Loop (cost=3.35..5.68 rows=9 width=4) (actual time=0.017..0.028 rows=9 loops=1)
-> Hash Join (cost=3.20..4.63 rows=3 width=8) (actual time=0.015..0.021 rows=3 loops=1)
Hash Cond: (qp.id = qpp.propertyset_id)
-> Seq Scan on propertyset qp (cost=0.00..1.25 rows=24 width=4) (actual time=0.002..0.005 rows=24 loops=1)
Filter: (NOT deleted)
Rows Removed by Filter: 1
-> Hash (cost=3.17..3.17 rows=3 width=4) (actual time=0.006..0.006 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using fki_propertyset_propertytype_propertytype_id on propertyset_propertytype qpp (cost=0.14..3.17 rows=3 width=4) (actual time=0.003..0.005 rows=3 loops=1)
Index Cond: (propertytype_id = 41)
-> Index Scan using fki_object_type_propertyset_propertyset_id on object_type_propertyset qotp (cost=0.14..0.32 rows=3 width=8) (actual time=0.001..0.002 rows=3 loops=3)
Index Cond: (propertyset_id = qp.id)
-> Index Scan using pk_propertyvalue on propertyvalue propertyvalue_1 (cost=0.56..16.76 rows=1 width=14) (actual time=1.269..1.444 rows=0 loops=128613)
Index Cond: ((object_id = ANY ($0)) AND (object_id = testsearch.object_id) AND (propertytype_id = 41) AND (language_id = 1))
我通常求助于 CTE 来解决这些我对 postgresql 不够了解的情况,并且下面的工作就像我假设上面的语句一样。
WITH relevant_object_ids AS (
SELECT o."id"
FROM "object_table" o
INNER JOIN "propertyvalue_table" pt
ON o."id" = pt."object_id"
AND pt."propertytype_id" = 41
AND pt."language_id" = 1
GROUP BY o."id"
HAVING count(1) = <numberOfFilterValues>
AND array_agg(LOWER(pt."stringvalue")) @> string_to_array(lower('<filtervalue(s)>'), ',')
)
DELETE FROM "temporary_search_table"
WHERE NOT EXISTS (
SELECT 1
FROM relevant_object_ids
WHERE temporary_search_table."id = relevant_object_ids."id"
)
解决方案
推荐阅读
- regex - 正则表达式忽略几组符号中的所有内容
- python - 用 pygame 和 pyserial 控制电机
- flutter - 用路径在 Flutter 中绘制复杂元素,然后以不同的角度重绘相同的元素
- python - Python模拟树概率
- javascript - Firebase“运行此应用程序的环境不支持操作”
- java - 使用一种方法从文本文件中读取数字并返回素数
- c# - MenuItem 中的单击事件
- c++ - 是`boost::function
f`是指针还是对象? - javascript - 删除列 css 网格中的间隙
- angular - 在 firestore 中更新文档会使结果列表变为初始形式