JOIN much slower than UNION, even with indexes

Problem description

I have a query:

SELECT study."id"
FROM study
JOIN report ON (report."studyId" = study."id")
WHERE 
study.facts->'patientName'->>'value' = 'HELLO WORLD' OR 
report.variables->'patientName'->>'value' = 'HELLO WORLD'

All tables have indexes.

Why does this query take 4.5 seconds on 6000 rows? The EXPLAIN ANALYZE output is below:

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                                                                 |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Hash Join  (cost=383.69..1403.67 rows=39 width=52) (actual time=2734.257..2734.260 rows=0 loops=1)                                                                                         |
|   Hash Cond: ((study.id)::text = (report."studyId")::text)                                                                                                                                 |
|   Join Filter: ((((study.facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text) OR (((report.variables -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text)) |
|   Rows Removed by Join Filter: 7453                                                                                                                                                        |
|   ->  Seq Scan on study  (cost=0.00..1000.23 rows=7523 width=70) (actual time=0.020..13.548 rows=7523 loops=1)                                                                             |
|   ->  Hash  (cost=290.53..290.53 rows=7453 width=70) (actual time=5.052..5.053 rows=7453 loops=1)                                                                                          |
|         Buckets: 8192  Batches: 1  Memory Usage: 808kB                                                                                                                                     |
|         ->  Seq Scan on report  (cost=0.00..290.53 rows=7453 width=70) (actual time=0.014..3.235 rows=7453 loops=1)                                                                        |
| Planning Time: 0.896 ms                                                                                                                                                                    |
| Execution Time: 2734.323 ms                                                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I have a UNION query that does the same thing but is much faster (0.001 seconds). I would like to understand better why my JOIN query is so much slower:

SELECT id::text
FROM study
WHERE study.facts->'patientName'->>'value' = 'HELLO WORLD'
UNION
SELECT report."studyId"::text
FROM report
WHERE report.variables->'patientName'->>'value' = 'HELLO WORLD';
+-------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                      |
|-------------------------------------------------------------------------------------------------------------------------------------------------|
| HashAggregate  (cost=143.12..143.51 rows=39 width=32) (actual time=0.040..0.041 rows=0 loops=1)                                                 |
|   Group Key: ((study.id)::text)                                                                                                                 |
|   ->  Append  (cost=4.58..143.02 rows=39 width=32) (actual time=0.038..0.039 rows=0 loops=1)                                                    |
|         ->  Bitmap Heap Scan on study  (cost=4.58..134.14 rows=38 width=32) (actual time=0.026..0.026 rows=0 loops=1)                           |
|               Recheck Cond: (((facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text)                                          |
|               ->  Bitmap Index Scan on "IDX_facts_patientName"  (cost=0.00..4.57 rows=38 width=0) (actual time=0.023..0.023 rows=0 loops=1)     |
|                     Index Cond: (((facts -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text)                                      |
|         ->  Index Scan using "IDX_variables_patientName" on report  (cost=0.28..8.30 rows=1 width=32) (actual time=0.012..0.012 rows=0 loops=1) |
|               Index Cond: (((variables -> 'patientName'::text) ->> 'value'::text) = 'HELLO WORLD'::text)                                        |
| Planning Time: 0.560 ms                                                                                                                         |
| Execution Time: 0.103 ms                                                                                                                        |
+-------------------------------------------------------------------------------------------------------------------------------------------------+

Tags: sql, postgresql

Solution


JOINs and UNIONs are two entirely different operations.

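  • A JOIN combines columns from two tables into one result set, pairing rows from both tables according to your matching condition (the ON clause).

  • A UNION, on the other hand, appends the result of one query to the result of another. In more detail, the difference between UNION and UNION ALL is that UNION is effectively a SELECT DISTINCT over the result of UNION ALL.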
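
The difference between UNION ALL and UNION can be seen with a small self-contained sketch. The inline VALUES lists and the aliases s and r are made up for illustration and are not part of the question's schema:

```sql
-- UNION ALL keeps every row from both branches, duplicates included.
SELECT id FROM (VALUES ('a'), ('b')) AS s(id)
UNION ALL
SELECT id FROM (VALUES ('b'), ('c')) AS r(id);
-- returns 4 rows: 'b' appears twice

-- UNION removes duplicates from the combined result.
SELECT id FROM (VALUES ('a'), ('b')) AS s(id)
UNION
SELECT id FROM (VALUES ('b'), ('c')) AS r(id);
-- returns 3 rows: a, b, c (row order is not guaranteed without ORDER BY)
```

Neither statement has a condition relating the two branches; rows are simply stacked, and UNION adds a deduplication step on top.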

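In short, a UNION does not need a matching condition to decide which rows to add to the result. The two plans in the question reflect this: the JOIN query's OR condition refers to both tables, so it is evaluated as a Join Filter on the joined rows after sequential scans of study and report, whereas each branch of the UNION query filters a single table and can use its expression index.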
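
If the JOIN itself is needed (for example, to return only studies that have a report), one possible rewrite is to split the OR into two joined branches so that each WHERE clause touches a single table. This is a sketch using the table and column names from the question. Whether the planner actually picks the indexes depends on your data and statistics, so check it with EXPLAIN ANALYZE:

```sql
-- Branch 1: filter on study only, so the planner may use the index on study.facts.
SELECT study."id"
FROM study
JOIN report ON report."studyId" = study."id"
WHERE study.facts->'patientName'->>'value' = 'HELLO WORLD'
UNION
-- Branch 2: filter on report only, so the planner may use the index on report.variables.
SELECT study."id"
FROM study
JOIN report ON report."studyId" = study."id"
WHERE report.variables->'patientName'->>'value' = 'HELLO WORLD';
```

Note that UNION removes duplicate ids, so a study with several matching reports appears once here, while the original JOIN query would return it once per matching report.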

