首页 > 解决方案 > 具有慢速 remote_scan 查询的 Postgres Citus

问题描述

我在 Citus 上运行一个需要几个连接的查询。

https://explain.dalibo.com/plan/xCH

让我们不要考虑查询是 EAV 模型(实体属性值)这一事实,而是 sort/remote_scan 只需 100 个元素就需要大约 30 秒的事实。

PS:我也在没有 EAV 模型的情况下运行它,并且遇到了相同的 remote_scan 问题。

这对我来说看起来有点慢,因为其余的似乎都非常快。


Limit  (cost=3821.93..3822.18 rows=100 width=64) (actual time=31798.709..31798.728 rows=100 loops=1)
  Output: remote_scan."?column?", remote_scan."?column?_1"
  ->  Sort  (cost=3821.93..4071.93 rows=100000 width=64) (actual time=31798.707..31798.715 rows=100 loops=1)
        Output: remote_scan."?column?", remote_scan."?column?_1"
        Sort Key: remote_scan."?column?_1", remote_scan."?column?"
        Sort Method: top-N heapsort  Memory: 35kB
        ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=64) (actual time=31794.489..31794.942 rows=3200 loops=1)
              Output: remote_scan."?column?", remote_scan."?column?_1"
              ->  Distributed Subplan 13_1
                    ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=274) (actual time=4.761..4.764 rows=26 loops=1)
                          Output: remote_scan.entity_id
                          Task Count: 32
                          Tasks Shown: One of 32
                          ->  Task
                                Node: host=dbhost port=5433 dbname=EAVDB
                                ->  Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 objects_eav  (cost=0.56..3141.90 rows=388 width=21) (actual time=0.052..0.126 rows=1 loops=1)
                                      Output: entity_id
                                      Index Cond: ((objects_eav.attribute_group)::text = ''Author''::text)
                                      Filter: ((objects_eav.json_value ->> ''UnixName''::text) = ANY (''{Brian}''::text[]))
                                      Rows Removed by Filter: 134
                                      Buffers: shared hit=58
                                    Planning Time: 0.060 ms
                                    Execution Time: 0.137 ms
                    Planning Time: 0.000 ms
                    Execution Time: 4.787 ms
              ->  Distributed Subplan 13_2
                    ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=274) (actual time=10.329..10.329 rows=0 loops=1)
                          Output: remote_scan.from_entity_id
                          Task Count: 32
                          Tasks Shown: One of 32
                          ->  Task
                                Node: host=dbhost port=5433 dbname=EAVDB
                                ->  Nested Loop  (cost=1.25..3631.28 rows=1 width=21) (actual time=0.017..0.017 rows=0 loops=1)
                                      Output: usertag.from_entity_id
                                      Buffers: shared hit=10
                                      ->  Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 tag  (cost=0.56..3052.29 rows=15 width=21) (actual time=0.017..0.017 rows=0 loops=1)
                                            Output: tag.entity_id, tag.attribute_group, tag.attribute_version, tag.json_value, tag.changed_json_fields, tag.created_by, tag.created
                                            Index Cond: ((tag.attribute_group)::text = ''Tag''::text)
                                            Filter: ((tag.json_value ->> ''Name''::text) = ''Access''::text)
                                            Rows Removed by Filter: 6
                                            Buffers: shared hit=10
                                      ->  Index Scan using entity_relation_pkey_102072 on public.entity_relation_102072 usertag  (cost=0.69..38.59 rows=1 width=21) (never executed)
                                            Output: usertag.from_entity_id
                                            Index Cond: (((usertag.from_entity_id)::text = (tag.entity_id)::text) AND ((usertag.relation_type)::text = ''UserTag''::text))
                                            Filter: ((usertag.json_value ->> ''Value''::text) = ''private''::text)
                                    Planning Time: 0.315 ms
                                    Execution Time: 0.035 ms
                    Planning Time: 0.000 ms
                    Execution Time: 10.346 ms
              Task Count: 32
              Tasks Shown: One of 32
              ->  Task
                    Node: host=dbhost port=5433 dbname=EAVDB
                    ->  Limit  (cost=1099356.66..1099356.67 rows=1 width=64) (actual time=1917.101..1917.111 rows=100 loops=1)
                          Output: ((a.json_value ->> ''ID''::text)), ((a.json_value ->> ''RegistrationDate''::text))
                          Buffers: shared hit=614084 read=264 dirtied=2
                          ->  Sort  (cost=1099356.66..1099356.67 rows=1 width=64) (actual time=1917.100..1917.104 rows=100 loops=1)
                                Output: ((a.json_value ->> ''ID''::text)), ((a.json_value ->> ''RegistrationDate''::text))
                                Sort Key: ((a.json_value ->> ''RegistrationDate''::text)), ((a.json_value ->> ''ID''::text))
                                Sort Method: top-N heapsort  Memory: 37kB
                                Buffers: shared hit=614084 read=264 dirtied=2
                                ->  Nested Loop Semi Join  (cost=1001.93..1099356.65 rows=1 width=64) (actual time=0.999..1911.488 rows=8222 loops=1)
                                      Output: (a.json_value ->> ''ID''::text), (a.json_value ->> ''RegistrationDate''::text)
                                      Buffers: shared hit=614084 read=264 dirtied=2
                                      ->  Nested Loop  (cost=1001.13..1099349.91 rows=1 width=395) (actual time=0.905..1578.191 rows=14169 loops=1)
                                            Output: a.json_value, a.entity_id, aei.entity_id
                                            Inner Unique: true
                                            Buffers: shared hit=535024 read=262 dirtied=2
                                            ->  Gather  (cost=1000.56..1099345.22 rows=1 width=21) (actual time=0.316..130.349 rows=64054 loops=1)
                                                  Output: aei.entity_id
                                                  Workers Planned: 2
                                                  Workers Launched: 2
                                                  Buffers: shared hit=214973 read=43 dirtied=2
                                                  ->  Parallel Index Scan using objects_eav_attribute_group_idx_102040 on public.objects_eav_102040 aei  (cost=0.56..1098345.12 rows=1 width=21) (actual time=0.370..541.166 rows=21351 loops=3)
                                                        Output: aei.entity_id
                                                        Index Cond: ((aei.attribute_group)::text = ''UserExtraInfo''::text)
                                                        Filter: ((((aei.json_value ->> ''Stat5''::text))::double precision > ''0.1''::double precision) AND (((aei.json_value ->> ''Stat5''::text))::double precision <= ''1000''::double precision) AND (((aei.json_value ->> ''NLogin''::text))::integer = 0) AND (((aei.json_value ->> ''Views''::text))::integer = 0) AND (((aei.json_value ->> ''Stat2''::text))::integer = 0) AND (((aei.json_value ->> ''Stat3''::text))::integer = 0) AND (((aei.json_value ->> ''Stat4''::text))::integer = 1))
                                                        Rows Removed by Filter: 494566
                                                        Buffers: shared hit=214973 read=43 dirtied=2
                                                        Worker 0: actual time=0.146..1461.372 rows=60358 loops=1
                                                          Buffers: shared hit=190199 read=41 dirtied=2
                                                        Worker 1: actual time=0.895..151.434 rows=3389 loops=1
                                                          Buffers: shared hit=19233 read=2
                                            ->  Index Scan using objects_eav_pkey_102040 on public.objects_eav_102040 a  (cost=0.57..2.62 rows=1 width=374) (actual time=0.024..0.024 rows=0 loops=64054)
                                                  Output: a.entity_id, a.attribute_group, a.attribute_version, a.json_value, a.changed_json_fields, a.created_by, a.created
                                                  Index Cond: (((a.entity_id)::text = (aei.entity_id)::text) AND ((a.attribute_group)::text = ''User''::text))
                                                  Filter: ((NOT (hashed SubPlan 1)) AND ((a.json_value ->> ''RegistrationDate''::text) > ''2006-01-01''::text) AND ((a.json_value ->> ''RegistrationDate''::text) <= ''2020-06-01''::text) AND ((a.json_value ->> ''Referral''::text) = ANY (''{Google,Facebook}''::text[])) AND ((a.json_value ->> ''UserType''::text) = ''normal''::text) AND ((a.json_value ->> ''Status''::text) = ''Online''::text) AND ((a.json_value ->> ''LastLogin''::text) = ''1d''::text) AND ((a.json_value ->> ''Region''::text) = ''USA''::text))
                                                  Rows Removed by Filter: 1
                                                  Buffers: shared hit=320051 read=219
                                                  SubPlan 1
                                                    ->  Function Scan on pg_catalog.read_intermediate_result intermediate_result_1  (cost=0.00..0.00 rows=1 width=274) (actual time=0.047..0.047 rows=0 loops=1)
                                                          Output: intermediate_result_1.from_entity_id
                                                          Function Call: read_intermediate_result(''13_2''::text, ''binary''::citus_copy_format)
                                      ->  Hash Semi Join  (cost=0.80..6.30 rows=44 width=21) (actual time=0.023..0.023 rows=1 loops=14169)
                                            Output: amp.from_entity_id
                                            Hash Cond: ((amp.to_entity_id)::text = (intermediate_result.entity_id)::text)
                                            Buffers: shared hit=79060 read=2
                                            ->  Index Only Scan using entity_relation_pkey_102072 on public.entity_relation_102072 amp  (cost=0.69..5.47 rows=89 width=42) (actual time=0.022..0.022 rows=4 loops=14169)
                                                  Output: amp.from_entity_id, amp.relation_type, amp.to_entity_id, amp.to_entity_id_2
                                                  Index Cond: ((amp.from_entity_id = (a.entity_id)::text) AND (amp.relation_type = ''UserMaintainer''::text))
select a.json_value ->> 'ID', a.json_value ->> 'RegistrationDate'
from objects_eav a
where a.attribute_group = 'User'
  and a.json_value ->> 'UserType' in ('normal')
  and a.json_value ->> 'Status' in ('Online')
  and a.json_value ->> 'LastLogin' = '1d'
  and a.json_value ->> 'RegistrationDate' > '2006-01-01'
  and a.json_value ->> 'Region' in ('USA')
  and a.json_value ->> 'RegistrationDate' <= '2020-06-01'
  and a.json_value ->> 'Referral' in ('Google', 'Facebook')
  and a.entity_id in (select aei.entity_id
                      from objects_eav aei
                      where aei.attribute_group = 'UserExtraInfo'
                        and (aei.json_value ->> 'NLogin')::int = 0
                        and (aei.json_value ->> 'Views')::int in (0)
                        and (aei.json_value ->> 'Stat2')::int in (0)
                        and (aei.json_value ->> 'Stat3')::int in (0)
                        and (aei.json_value ->> 'Stat4')::int in (1)
                        and (aei.json_value ->> 'Stat5')::float > 0.100000
                        and (aei.json_value ->> 'Stat5')::float <= 1000.000000)
  and a.entity_id not in (select UserTag.from_entity_id
                          from entity_relation UserTag
                                   join objects_eav Tag on
                                  Tag.attribute_group = 'Tag'
                                  and UserTag.from_entity_id = Tag.entity_id
                          where UserTag.relation_type = 'UserTag'
                            and Tag.json_value ->> 'Name' = 'Access'
                            and UserTag.json_value ->> 'Value' = 'private')
  and a.entity_id in (select amp.from_entity_id
                      from entity_relation amp
                      where amp.relation_type = 'UserMaintainer'
                        and amp.to_entity_id in (select entity_id
                                                 from objects_eav
                                                 where attribute_group = 'Author'
                                                   and json_value ->> 'UnixName' in
                                                       ('Brian')))
ORDER BY a.json_value ->> 'RegistrationDate', a.json_value ->> 'ID'
limit 100;


标签: postgresqlquery-optimizationcitus

解决方案


推荐阅读