sql-server - How to performance-tune this query
Problem description
I have the following query, which takes a very long time (about 2 hours) to execute:
CREATE TABLE #compareList
(
    id INT IDENTITY(1,1),
    poy_no varchar(max),
    poy_stat_cd varchar(max),
    poy_eff_dd datetime,
    poy_exp_dd datetime,
    [Name] [nvarchar] (max)
);
DECLARE @poy_no varchar(max), @poy_stat_cd varchar(max),
        @poy_eff_dd datetime, @poy_exp_dd datetime, @remarks nvarchar(max)
DECLARE C_Compare CURSOR STATIC FOR
    SELECT b.poy_no, b.poy_stat_cd, b.poy_eff_dd, b.poy_exp_dd, a.remarks
    FROM table1 a
OPEN C_Compare
FETCH NEXT FROM C_Compare
    INTO @poy_no, @poy_stat_cd, @poy_eff_dd, @poy_exp_dd, @remarks
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #compareList
    SELECT @poy_no, @poy_stat_cd, @poy_eff_dd, @poy_exp_dd, @remarks
    FETCH NEXT FROM C_Compare
        INTO @poy_no, @poy_stat_cd, @poy_eff_dd, @poy_exp_dd, @remarks
END
CLOSE C_Compare;
DEALLOCATE C_Compare;
-- This query has performance issue
SELECT
    COUNT(1)
FROM
    #compareList a,
    (SELECT
        pid, single_string_name, original_script_name,
        surname, first_name, middle_name
     FROM
        DJ_PERSON WITH (INDEX (NCIndex_all_needed_columns))) AS p,
    (SELECT pid, desc1 FROM PERSON_DESC) AS pd,
    DESC1 AS d
WHERE
    p.pid = pd.pid
    AND pd.desc1 = d.d1id
    AND replace(replace(replace(rtrim(ltrim(a.name)), ' ',''), ',',''), '.','') != ''
    AND (replace(replace(replace(a.Name, ' ',''), ',',''), '.','') = replace(replace(replace(p.single_string_name, ' ',''), ',',''), '.','')
            COLLATE database_default
        OR replace(replace(replace(a.Name, ' ',''), ',',''), '.','') = replace(replace(replace(p.original_script_name, ' ',''), ',',''), '.','')
            COLLATE database_default
        OR
        replace(replace(replace(a.Name, ' ',''), ',',''), '.','') = replace(replace(replace(p.surname+p.first_name+p.middle_name, ' ',''), ',',''), '.','')
    )
Below are the row counts for each table. The PERSON and PERSON_DESC tables have a large number of rows.
PERSON - 4638768
PERSON_DESC - 2040027
#compareList - 26
I tried creating indexes on the PERSON and PERSON_DESC tables.
On table PERSON I created an index on pid, single_string_name, original_script_name, surname, first_name, middle_name.
On table PERSON_DESC I created an index on pid, desc1.
Below is the statistics output:
Table '#compareList________________________________________________________________________________________________________0000000001C5'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 1, logical reads 8055799, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 16, logical reads 43232, physical reads 5431, read-ahead reads 42753, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PERSON'. Scan count 1, logical reads 42966, physical reads 1, read-ahead reads 10440, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'DESC'. Scan count 1, logical reads 7060, physical reads 1, read-ahead reads 7054, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'DESC1'. Scan count 1, logical reads 1, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
What changes can I make to improve the execution time of this query?
Solution
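Because of all the function calls in your where clause, you have a serious sargability problem: the optimizer cannot seek on a column that is wrapped in a function, so few indexes, if any, will be used. I have a few suggestions.
First, if there is any way to restrict the records that need to be tested before calling any functions, do that, put the results into a temp table, and then run the function-based where clause against it. Something like: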
select columns, compute columns that we can compute here (should be one side of the compare)
into #MyTempTable
from MyTable
where {my sargable conditions};
-- Potentially add some indexes to the temp table computed columns
select columns
from #MyTempTable
where {my non-sargable conditions};
Second, OR across multiple conditions is a well-known performance problem. It can be worked around with UNION ALL, for example:
SELECT {your query}
WHERE p.pid = pd.pid
AND pd.desc1 = d.d1id
AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.single_string_name, ' ',''), ',',''), '.','') COLLATE database_default
UNION ALL
SELECT {your query}
WHERE p.pid = pd.pid
AND pd.desc1 = d.d1id
AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.original_script_name, ' ',''), ',',''), '.','') COLLATE database_default
UNION ALL
SELECT {your query}
WHERE p.pid = pd.pid
AND pd.desc1 = d.d1id
AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.surname+p.first_name+p.middle_name, ' ',''), ',',''), '.','');
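One thing to keep in mind with this rewrite: the original query returns a single COUNT, and UNION ALL keeps a row once for every branch it satisfies, so a person whose single_string_name and original_script_name both match would be counted twice. A minimal sketch of one way to get a single count back is to select identifying columns in each branch, combine the branches with UNION so duplicates collapse, and count the result. It assumes that a.id, p.pid, pd.desc1 and d.d1id together identify one joined row, which is an assumption about your data and not something the question states. The index hint from the question is left out for brevity.

```sql
-- Sketch only: the four selected columns are ASSUMED to identify one joined row.
SELECT COUNT(1)
FROM (
    SELECT a.id, p.pid, pd.desc1, d.d1id
    FROM #compareList a, DJ_PERSON p, PERSON_DESC pd, DESC1 d
    WHERE p.pid = pd.pid
      AND pd.desc1 = d.d1id
      AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
      AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.single_string_name, ' ',''), ',',''), '.','') COLLATE database_default
    UNION   -- not UNION ALL: a row that matches several branches is kept once
    SELECT a.id, p.pid, pd.desc1, d.d1id
    FROM #compareList a, DJ_PERSON p, PERSON_DESC pd, DESC1 d
    WHERE p.pid = pd.pid
      AND pd.desc1 = d.d1id
      AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
      AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.original_script_name, ' ',''), ',',''), '.','') COLLATE database_default
    UNION
    SELECT a.id, p.pid, pd.desc1, d.d1id
    FROM #compareList a, DJ_PERSON p, PERSON_DESC pd, DESC1 d
    WHERE p.pid = pd.pid
      AND pd.desc1 = d.d1id
      AND replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','') != ''
      AND replace(replace(replace(a.[Name], ' ',''), ',',''), '.','') = replace(replace(replace(p.surname+p.first_name+p.middle_name, ' ',''), ',',''), '.','')
) AS matches;
```

If the branches are known never to overlap, UNION ALL avoids the extra duplicate-removal step and gives the same count.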
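Third, if the first two suggestions don't help, you may want to consider materializing the data you use in the where clause. By that I mean, as an example, computing: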
replace(replace(replace(p.single_string_name, ' ',''), ',',''), '.','') COLLATE database_default
and storing that value in a new column in table p, which you can then index. You may have to write triggers to keep it maintained.
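If triggers feel heavy for this, a persisted computed column is one alternative SQL Server offers for the same idea, since the engine maintains the value itself on insert and update. The following is a minimal sketch, assuming single_string_name on DJ_PERSON is not a (max) type and that the normalized values fit within SQL Server's index key size limit (SQL Server may warn about the potential key length when the index is created). The column and index names are made up for illustration.

```sql
-- Hypothetical names: single_string_name_norm, IX_DJ_PERSON_single_string_name_norm.
-- REPLACE is deterministic, so the expression can be PERSISTED and then indexed.
ALTER TABLE DJ_PERSON
    ADD single_string_name_norm AS
        REPLACE(REPLACE(REPLACE(single_string_name, ' ', ''), ',', ''), '.', '')
        PERSISTED;
GO
-- pid is included so the join to PERSON_DESC can be answered from the index.
CREATE NONCLUSTERED INDEX IX_DJ_PERSON_single_string_name_norm
    ON DJ_PERSON (single_string_name_norm)
    INCLUDE (pid);
GO
```

The where clause can then compare against p.single_string_name_norm directly instead of wrapping p.single_string_name in REPLACE calls. The same pattern would apply to original_script_name and to the concatenated surname, first_name and middle_name.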
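That said, given that part of your data is already in the temp table #compareList, you should store the comparison value directly in the temp table, i.e. add another column that stores: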
replace(replace(replace(rtrim(ltrim(a.[Name])), ' ',''), ',',''), '.','')
and then possibly index it.
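A minimal sketch of that, assuming a hypothetical column name name_norm and an assumed upper bound of 400 characters (an nvarchar(max) column cannot be an index key, so a bounded length is needed if you want the index):

```sql
-- name_norm and the index name are illustrative; 400 is an assumed upper bound.
-- COLLATE DATABASE_DEFAULT gives the column the current database's collation
-- rather than tempdb's.
ALTER TABLE #compareList
    ADD name_norm nvarchar(400) COLLATE DATABASE_DEFAULT NULL;
GO
-- Separate batch, so the new column exists when this statement is compiled.
-- A value longer than 400 characters raises a truncation error under the
-- default ANSI_WARNINGS setting instead of being cut silently.
UPDATE #compareList
SET name_norm = REPLACE(REPLACE(REPLACE(RTRIM(LTRIM([Name])), ' ', ''), ',', ''), '.', '');
GO
CREATE NONCLUSTERED INDEX IX_compareList_name_norm
    ON #compareList (name_norm);
GO
```

The non-empty check then becomes a.name_norm != '' and each equality compares a.name_norm with the person-side expression, so the #compareList side no longer needs any function calls in the where clause.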