首页 > 解决方案 > 哈希键列的隐式转换导致插入速度非常慢

问题描述

我正在尝试将一些数据插入到一些插入时间太长的阶段表中。例如,一个包含多达 600000 条记录的表需要将近一个小时才能完成。在查询的选择部分,我们正在创建列的哈希,稍后用于更改检测。由于我们使用的数据仓库方法 DataVault,我们无法删除更改哈希。当我查看执行计划时,我看到表达式中的类型转换警告。示例如下所示。

<Warnings>
              <PlanAffectingConvert ConvertIssue="Cardinality Estimate" Expression="CONVERT(varchar(max),[load].[load_acbs_loan_product_dimension].[product_month],0)" />

我还包含了我们正在使用的插入语句的示例:

      INSERT INTO [stage_acbs_balance_category_dimension]
      ( hk_h_balance_category_dimension
      , balance_category_key
      , balance_category_code
      , balance_category_description
      , balance_class_code
      , balance_class_description
      , user_define_code_1
      , user_define_code_2
      , user_define_code_3
      , user_define_code_4
      , user_define_code_5
      , bal_cat_short_name
      , include_bal_in_tax_reporting
      , include_in_billings_statements
      , include_in_past_due_reporting
      , dss_change_hash_acbs_balance_category_dimension_lroc
      , dss_record_source
      , dss_load_date
      , dss_create_time)
      SELECT  CAST(HASHBYTES('sha2_256',
               COALESCE(CAST(load_acbs_balance_category_dimension.balance_category_key AS VARCHAR(MAX)),'null')
               ) AS BINARY(32)) AS hk_h_balance_category_dimension 
           , load_acbs_balance_category_dimension.balance_category_key AS balance_category_key 
           , load_acbs_balance_category_dimension.balance_category_code AS balance_category_code 
           , load_acbs_balance_category_dimension.balance_category_description AS balance_category_description 
           , load_acbs_balance_category_dimension.balance_class_code AS balance_class_code 
           , load_acbs_balance_category_dimension.balance_class_description AS balance_class_description 
           , load_acbs_balance_category_dimension.user_define_code_1 AS user_define_code_1 
           , load_acbs_balance_category_dimension.user_define_code_2 AS user_define_code_2 
           , load_acbs_balance_category_dimension.user_define_code_3 AS user_define_code_3 
           , load_acbs_balance_category_dimension.user_define_code_4 AS user_define_code_4 
           , load_acbs_balance_category_dimension.user_define_code_5 AS user_define_code_5 
           , load_acbs_balance_category_dimension.bal_cat_short_name AS bal_cat_short_name 
           , load_acbs_balance_category_dimension.include_bal_in_tax_reporting AS include_bal_in_tax_reporting 
           , load_acbs_balance_category_dimension.include_in_billings_statements AS include_in_billings_statements 
           , load_acbs_balance_category_dimension.include_in_past_due_reporting AS include_in_past_due_reporting 
           , CAST(HASHBYTES('SHA2_256',
               COALESCE(CAST(load_acbs_balance_category_dimension.balance_category_code AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.balance_category_description AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.balance_class_code AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.balance_class_description AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.user_define_code_1 AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.user_define_code_2 AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.user_define_code_3 AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.user_define_code_4 AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.user_define_code_5 AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.bal_cat_short_name AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.include_bal_in_tax_reporting AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.include_in_billings_statements AS VARCHAR(MAX)),'null') +'||'+
               COALESCE(CAST(load_acbs_balance_category_dimension.include_in_past_due_reporting AS VARCHAR(MAX)),'null')
               ) AS BINARY(32)) AS dss_change_hash_acbs_balance_category_dimension_lroc 
           , load_acbs_balance_category_dimension.dss_record_source AS dss_record_source 
           , load_acbs_balance_category_dimension.dss_load_date AS dss_load_date 
           , getdate() AS dss_create_time 
      FROM [load_acbs_balance_category_dimension] load_acbs_balance_category_dimension

我正在寻找一种方法来摆脱隐式转换并让插入执行。我最初想让列成为计算的持久散列,但散列是不确定的。任何想法将不胜感激。

标签: sql-serverperformancedata-warehouse

解决方案


我选择将 VARCHAR(MAX) 更改为 VARBINARY(MAX,这极大地提高了性能,从 1 小时缩短到 40 秒。我还希望更改散列算法。


推荐阅读