Fixing a table join in a SQL query causes the query to stop running

Problem description

I'm working with a legacy SQL query in Teradata that builds a volatile table used to push data to a downstream analytics application. I'm new to this work, and after a quick look at the tables I noticed what looked like a bug: the query references the same table on both sides of one join condition. I fixed it and re-ran the query, expecting better results.

The database holds billions of records. Before I fixed the query, the original volatile table contained roughly 2.5 million rows and took about 5 seconds to build. After the fix, the query runs for hours with no sign of progress and then usually crashes the application.

The original query is below. I've commented the line where I "fixed" the join.

I also tried running just the SELECT with SAMPLE 10, simply to see whether any data would come back at all, but again it ran forever.

Any ideas as to why the query never finishes? Or how to improve performance enough to see whether the fix actually works? Or even how to check what is going on under the hood?
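Two low-risk ways to look under the hood are EXPLAIN, which shows the optimizer's plan without running the query, and Teradata's query log. This is a sketch: `DBC.QryLogV` only has data if DBQL query logging is enabled on the system, and the exact columns vary by release:

```sql
-- Show the plan the optimizer would use, without executing the query
EXPLAIN
SELECT a.ord_nbr
FROM PRD_AM_BI_1.am_order_eda_fact a
SAMPLE 10;

-- If DBQL logging is enabled, inspect recent queries and their resource use
-- (DBC.QryLogV and these column names are assumptions; they vary by release)
SELECT StartTime, AMPCPUTime, TotalIOCount, QueryText
FROM DBC.QryLogV
WHERE UserName = USER
ORDER BY StartTime DESC;
```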

create multiset volatile table order_date as
(
    SELECT a.ord_nbr,
           a.ord_spot_cnt AS ord_spot_cnt,
           a.ord_totl_amt AS ord_totl_amt,
           a.eclipse_regn_nm AS eclipse_regn_nm,
           a.sale_offc_key AS sales_office,
           so.sale_offc_nm AS sales_office_name,
           a.sale_prsn_key AS sales_person,
           sp.sale_prsn_nm AS sales_person_name,
           trunc(a.ctrc_begn_dt, 'month') AS ctrc_mnth,
           (cast(a.ctrc_end_dt AS date format 'yyyy/mm/dd')
            - cast(a.ctrc_begn_dt AS date format 'yyyy/mm/dd')) AS ctrc_duration,
           CASE
               WHEN extract(month FROM a.ctrc_begn_dt) IN (10, 11) THEN 1
               WHEN extract(month FROM a.ctrc_begn_dt) IN (12)
                    AND extract(day FROM a.ctrc_begn_dt) <= 22 THEN 1
               ELSE 0
           END AS first_look_flg,
           CASE
               WHEN extract(month FROM a.ctrc_begn_dt) IN (12)
                    AND extract(day FROM a.ctrc_begn_dt) > 22 THEN 2
               WHEN extract(month FROM a.ctrc_begn_dt) IN (1) THEN 2
               WHEN extract(month FROM a.ctrc_begn_dt) IN (2)
                    AND extract(day FROM a.ctrc_begn_dt) <= 14 THEN 1
               ELSE 0
           END AS second_look_flg
    FROM PRD_AM_BI_1.am_order_eda_fact a
    INNER JOIN prd_am_bi_1.AM_SALES_OFFICE_EDA_DIM so
        ON (a.SALE_OFFC_KEY = so.SALE_OFFC_KEY
            AND so.ECLIPSE_REGN_NM = so.ECLIPSE_REGN_NM)
            -- I 'fixed' the above line to "and a.ECLIPSE_REGN_NM = so.ECLIPSE_REGN_NM)"
    INNER JOIN prd_am_bi_1.AM_SALES_PERSON_EDA_DIM sp
        ON (a.SALE_PRSN_KEY = sp.SALE_PRSN_KEY
            AND a.ECLIPSE_REGN_NM = sp.ECLIPSE_REGN_NM)
)
with data
on commit preserve rows
;
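A side note on the line being "fixed": `so.ECLIPSE_REGN_NM = so.ECLIPSE_REGN_NM` compares a column with itself, so it is true for every row where the column is not NULL, which means the original query effectively joined the office dimension on SALE_OFFC_KEY alone. A quick way to see how much the corrected predicate changes things, without materializing the volatile table, is to compare the two join cardinalities directly (a sketch against the same tables as the query):

```sql
-- Row count with the original (tautological) office join:
-- the region predicate was always true, so only the key matters
SELECT COUNT(*)
FROM PRD_AM_BI_1.am_order_eda_fact a
INNER JOIN prd_am_bi_1.AM_SALES_OFFICE_EDA_DIM so
    ON a.SALE_OFFC_KEY = so.SALE_OFFC_KEY;

-- Row count with the corrected office join
SELECT COUNT(*)
FROM PRD_AM_BI_1.am_order_eda_fact a
INNER JOIN prd_am_bi_1.AM_SALES_OFFICE_EDA_DIM so
    ON a.SALE_OFFC_KEY = so.SALE_OFFC_KEY
   AND a.ECLIPSE_REGN_NM = so.ECLIPSE_REGN_NM;
```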

Edit:

Here is the EXPLAIN output for the "fixed" query. The total estimated time is the same for both queries, so I'm not sure why the "fixed" query never actually finishes.

Explanation                                                              
 ------------------------------------------------------------------------ 
   1) First, we lock PRD_AM_1.AM_ORDER_EDA_FACT in view                  
      PRD_AM_BI_1.am_order_eda_fact in TD_MAP1 for access, we lock       
      PRD_AM_1.AM_SALES_PERSON_EDA_DIM in view                           
      PRD_AM_BI_1.AM_SALES_PERSON_EDA_DIM in TD_MAP1 for access, and we  
      lock PRD_AM_1.AM_SALES_OFFICE_EDA_DIM in view                      
      PRD_AM_BI_1.AM_SALES_OFFICE_EDA_DIM in TD_MAP1 for access.         
   2) Next, we execute the following steps in parallel.                  
        1) We do an all-AMPs RETRIEVE step in TD_MAP1 from               
           PRD_AM_1.AM_SALES_OFFICE_EDA_DIM in view                      
           PRD_AM_BI_1.AM_SALES_OFFICE_EDA_DIM by way of an all-rows     
           scan with no residual conditions into Spool 2 (all_amps),
           which is redistributed by hash code to all AMPs in TD_Map1.
           Then we do a SORT to order Spool 2 by row hash.
           The size of Spool 2 is estimated with high confidence to be   
           426 rows (21,300 bytes).  The estimated time for this step is 
           0.01 seconds.                                                 
        2) We do an all-AMPs RETRIEVE step in TD_MAP1 from               
           PRD_AM_1.AM_SALES_PERSON_EDA_DIM in view                      
           PRD_AM_BI_1.AM_SALES_PERSON_EDA_DIM by way of an all-rows     
           scan with no residual conditions into Spool 3 (all_amps),
           which is redistributed by hash code to all AMPs in TD_Map1.
           Then we do a SORT to order Spool 3 by row hash.
           The size of Spool 3 is estimated with high confidence to be   
           229,155 rows (10,770,285 bytes).  The estimated time for this 
           step is 0.04 seconds.                                         
        3) We do an all-AMPs RETRIEVE step in TD_MAP1 from               
           PRD_AM_1.AM_ORDER_EDA_FACT in view                            
           PRD_AM_BI_1.am_order_eda_fact by way of an all-rows scan with 
           no residual conditions into Spool 4 (all_amps), which is
           redistributed by hash code to all AMPs in TD_Map1.
           Then we do a SORT to order Spool 4 by row hash.  The size of  
           Spool 4 is estimated with high confidence to be 2,041,151     
           rows (136,757,117 bytes).  The estimated time for this step   
           is 0.21 seconds.                                              
   3) We do an all-AMPs JOIN step in TD_Map1 from Spool 2 (Last Use) by  
      way of a RowHash match scan, which is joined to Spool 3 (Last Use) 
      by way of a RowHash match scan.  Spool 2 and Spool 3 are joined    
      using a merge join, with a join condition of ("ECLIPSE_REGN_NM =   
      ECLIPSE_REGN_NM").  The result goes into Spool 5 (all_amps), which 
      is built locally on the AMPs.  The size of Spool 5 is estimated    
      with no confidence to be 203,927 rows (17,129,868 bytes).  The     
      estimated time for this step is 0.02 seconds.                      
   4) We do an all-AMPs JOIN step in TD_Map1 from Spool 4 (Last Use) by  
      way of a RowHash match scan, which is joined to Spool 5 (Last Use) 
      by way of a RowHash match scan.  Spool 4 and Spool 5 are joined    
      using a merge join, with a join condition of ("(ECLIPSE_REGN_NM =  
      ECLIPSE_REGN_NM) AND ((SALE_PRSN_KEY = SALE_PRSN_KEY) AND          
      ((ECLIPSE_REGN_NM = ECLIPSE_REGN_NM) AND (SALE_OFFC_KEY =          
      SALE_OFFC_KEY )))").  The result goes into Spool 1 (group_amps),   
      which is built locally on the AMPs.  The size of Spool 1 is        
      estimated with no confidence to be 2,041,151 rows (293,925,744     
      bytes).  The estimated time for this step is 0.02 seconds.         
   -> The contents of Spool 1 are sent back to the user as the result of 
      statement 1.  The total estimated time is 0.24 seconds.            

Tags: sql, inner-join, teradata

Solution


Look at the EXPLAIN plan to see where the query is expected to spend the most cost, and pay particular attention to steps whose row counts are estimated "with no confidence" — in the plan above, that is both join steps, which usually indicates missing or stale statistics on the join columns. Then focus on bringing the cost down: verify each join condition, add filters or subqueries to reduce the rows being joined, collect statistics, or make use of indexes.
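Concretely, a common first step when estimates show "no confidence" is to collect statistics on every column that appears in a join condition. A minimal sketch against the base tables named in the EXPLAIN output (this assumes you have the COLLECT STATISTICS privilege on them; otherwise ask the DBA):

```sql
-- Statistics on the join columns of each table in the query
COLLECT STATISTICS
    COLUMN (SALE_OFFC_KEY, ECLIPSE_REGN_NM)
ON PRD_AM_1.AM_SALES_OFFICE_EDA_DIM;

COLLECT STATISTICS
    COLUMN (SALE_PRSN_KEY, ECLIPSE_REGN_NM)
ON PRD_AM_1.AM_SALES_PERSON_EDA_DIM;

COLLECT STATISTICS
    COLUMN (SALE_OFFC_KEY),
    COLUMN (SALE_PRSN_KEY),
    COLUMN (ECLIPSE_REGN_NM)
ON PRD_AM_1.AM_ORDER_EDA_FACT;
```

After collecting, re-run EXPLAIN and check whether the spool-size estimates move from "no confidence" to "high confidence" and look plausible.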

