首页 > 解决方案 > Postgres CASE 表达式 ELSE 子句即使在子句“真”时也会影响性能

问题描述

使用 PG 9.3 我有一个选择电话数据的报告查询。

如果当前用户是“管理员”,他们可以看到所有呼叫。否则,用户只能看到自己的通话。

因此我们有(简化)

create table phonecalls (phone_id int, destination varchar(100));
create table users (user_id int);
create table usergroups (user_id int, group_id int);
create table groups (group_id int, is_admin bool);
create table userphones (user_id int, phone_id int);

和以下权限条款:

SELECT * FROM phonecalls
WHERE
CASE WHEN ( SELECT is_admin FROM users join usergroups using (user_id) join groups using (group_id) WHERE user_id = 1 )
THEN true
ELSE
   exists ( SELECT phone_id FROM userphones 
            WHERE user_id = 1 
            AND userphones.phone_id = phonecalls.phone_id )
END

当数据库中有很多很多记录时,性能就是一个问题。

我发现,如果 user_id 为 1 的用户是管理员,如果我删除权限子句的 ELSE 部分,查询会加快速度,即

ELSE
   exists ( SELECT 1 )
END

但这似乎与 Postgres 文档中的以下声明相矛盾: https ://www.postgresql.org/docs/9.4/functions-conditional.html

CASE 表达式不会计算任何不需要确定结果的子表达式。

如果用户是管理员,ELSE 子句应该对查询执行时间没有影响吗?我是不是误会了?

编辑查询计划输出:

Seq Scan on phonecalls  (cost=139.44..421294.43 rows=5000 width=10) (actual time=0.071..5.598 rows=10000 loops=1)
  Filter: CASE WHEN $0 THEN true ELSE (alternatives: SubPlan 2 or hashed SubPlan 3) END
  InitPlan 1 (returns $0)
    ->  Nested Loop  (cost=36.89..139.44 rows=1538 width=1) (actual time=0.018..0.018 rows=0 loops=1)
          ->  Hash Join  (cost=36.89..80.21 rows=128 width=5) (actual time=0.018..0.018 rows=0 loops=1)
                Hash Cond: (groups.group_id = usergroups.group_id)
                ->  Seq Scan on groups  (cost=0.00..33.30 rows=2330 width=5) (actual time=0.002..0.002 rows=1 loops=1)
                ->  Hash  (cost=36.75..36.75 rows=11 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 0kB
                      ->  Seq Scan on usergroups  (cost=0.00..36.75 rows=11 width=8) (actual time=0.001..0.001 rows=0 loops=1)
                            Filter: (user_id = 1)
          ->  Materialize  (cost=0.00..40.06 rows=12 width=4) (never executed)
                ->  Seq Scan on users  (cost=0.00..40.00 rows=12 width=4) (never executed)
                      Filter: (user_id = 1)
  SubPlan 2
    ->  Seq Scan on userphones  (cost=0.00..42.10 rows=1 width=0) (never executed)
          Filter: ((user_id = 1) AND (phone_id = phonecalls.phone_id))
  SubPlan 3
    ->  Seq Scan on userphones userphones_1  (cost=0.00..36.75 rows=11 width=4) (actual time=0.009..0.010 rows=1 loops=1)
          Filter: (user_id = 1)
Total runtime: 6.229 ms

EDIT 2 'SELECT 1' 选项的查询计划

"Result  (cost=139.44..294.44 rows=10000 width=10) (actual time=0.044..3.713 rows=10000 loops=1)"
"  One-Time Filter: CASE WHEN $0 THEN true ELSE $1 END"
"  InitPlan 1 (returns $0)"
"    ->  Nested Loop  (cost=36.89..139.44 rows=1538 width=1) (actual time=0.028..0.028 rows=0 loops=1)"
"          ->  Hash Join  (cost=36.89..80.21 rows=128 width=5) (actual time=0.026..0.026 rows=0 loops=1)"
"                Hash Cond: (groups.group_id = usergroups.group_id)"
"                ->  Seq Scan on groups  (cost=0.00..33.30 rows=2330 width=5) (actual time=0.009..0.009 rows=1 loops=1)"
"                ->  Hash  (cost=36.75..36.75 rows=11 width=8) (actual time=0.000..0.000 rows=0 loops=1)"
"                      Buckets: 1024  Batches: 1  Memory Usage: 0kB"
"                      ->  Seq Scan on usergroups  (cost=0.00..36.75 rows=11 width=8) (actual time=0.000..0.000 rows=0 loops=1)"
"                            Filter: (user_id = 1)"
"          ->  Materialize  (cost=0.00..40.06 rows=12 width=4) (never executed)"
"                ->  Seq Scan on users  (cost=0.00..40.00 rows=12 width=4) (never executed)"
"                      Filter: (user_id = 1)"
"  InitPlan 2 (returns $1)"
"    ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)"
"  ->  Seq Scan on phonecalls  (cost=0.00..155.00 rows=10000 width=10) (actual time=0.012..1.502 rows=10000 loops=1)"
"Total runtime: 4.307 ms"

标签: postgresql

解决方案


区别在于FilterOne-Time Filter

在第一个查询中,CASE表达式中的条件取决于phonecalls.phone_id顺序扫描(即使从未执行该分支),因此过滤器将应用于所有 10000 个结果行。

在第二个查询中,过滤器只需要评估一次;查询是在运行InitPlan主查询之前执行的。

这 10000 张支票必须有所作为。


推荐阅读