首页 > 解决方案 > 用于大型且持续更新的表的 Postgres 慢速嵌套循环反连接

问题描述

我有两个表,ingestion_event 和 ingestion_dependency,其中第一个表中的一个或多个记录存在于第二个表中。

处理第一个表中的行时,应用程序会删除第二个表中的记录。

应用程序轮询第一个表以获取第二个表中不存在的行:

SELECT a.*, dependency_key FROM (
    SELECT e.*, dependency_key
    FROM ingestion_event e
        LEFT JOIN ingestion_dependency d ON
                e.object_key = d.object_key AND e.root_task_id = d.root_task_id
    WHERE d.object_key is null
    FOR NO KEY UPDATE of e SKIP LOCKED
) AS a 
where  a.status = 'QUEUED' 
limit 100
"Limit  (cost=0.55..579.30 rows=100 width=1560) (actual time=16199.602..71206.977 rows=100 loops=1)"
"  ->  Subquery Scan on a  (cost=0.55..1436371.23 rows=248188 width=1560) (actual time=16199.600..71206.897 rows=100 loops=1)"
"        ->  LockRows  (cost=0.55..1433889.35 rows=248188 width=1470) (actual time=16199.599..71206.771 rows=100 loops=1)"
"              ->  Nested Loop Anti Join  (cost=0.55..1431407.47 rows=248188 width=1470) (actual time=315.396..70222.109 rows=7457 loops=1)"
"                    ->  Seq Scan on ingestion_event e  (cost=0.00..424604.81 rows=917896 width=1362) (actual time=0.007..23196.192 rows=154371 loops=1)"
"                          Filter: (status = 'QUEUED'::text)"
"                          Rows Removed by Filter: 206432"
"                    ->  Index Scan using ingestion_dependency_object_key_idx on ingestion_dependency d  (cost=0.55..1.12 rows=1 width=219) (actual time=0.298..0.298 rows=1 loops=154371)"
"                          Index Cond: (e.object_key = object_key)"
"                          Filter: (e.root_task_id = root_task_id)"
"                          Rows Removed by Filter: 0"
"Planning time: 2.355 ms"
"Execution time: 71207.097 ms"

我在 ingestion_event 上有以下索引:

(root_task_id ASC NULLS LAST, object_key ASC NULLS LAST);
(status COLLATE ASC NULLS LAST);
(root_task_id ASC NULLS LAST)

我在 ingestion_dependency 列上有以下 btree 索引:

(root_task_id ASC NULLS LAST, dependency_key ASC NULLS LAST) (root_task_id, object_key) (root_task_id, object_key)

我的问题是为什么嵌套循环反连接如此昂贵?我的 postgres 版本是 9.6

标签: postgresql-9.6

解决方案


推荐阅读