sql - SQL 查询 - 长时间运行/占用 CPU 资源
问题描述
您好,我有以下 SQL 查询,平均需要 40 分钟才能运行,它引用的其中一个表中有超过 700 万条记录。
我已经通过数据库调整顾问运行了这个并应用了所有建议,我还在 sql 的活动监视器中对其进行了评估,并且没有推荐进一步的索引等。
任何建议都会很棒,在此先感谢
WITH CTE AS
(
SELECT r.Id AS ResultId,
r.JobId,
r.CandidateId,
r.Email,
CAST(0 AS BIT) AS EmailSent,
NULL AS EmailSentDate,
'PICKUP' AS EmailStatus,
GETDATE() AS CreateDate,
C.Id AS UserId,
C.Email AS UserEmail,
NULL AS Subject
FROM Result R
INNER JOIN Job J ON R.JobId = J.Id
INNER JOIN User C ON J.UserId = C.Id
WHERE
ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)
AND ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
AND ISNULL(R.Email,'') <> '' -- has an email address
AND ISNULL(R.EmailSent, CAST(0 AS BIT)) = CAST(0 AS BIT) -- email has not been sent
AND R.EmailSentDate IS NULL -- email has not been sent
AND ISNULL(R.EmailStatus,'') = '' -- email has not been sent
AND ISNULL(R.IsEmailSubscribe, 'True') <> 'False' -- not unsubscribed
-- not already been emailed for this job
AND NOT EXISTS (
SELECT SMTP.Email
FROM SMTP_Production SMTP
WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
)
-- not unsubscribed
AND NOT EXISTS (
SELECT u.Id FROM Unsubscribe u
WHERE ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')
)
AND NOT EXISTS (
SELECT SMTP.Id FROM SMTP_Production SMTP
WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
)
AND C.Id NOT IN (
-- list of ids
)
AND J.Id NOT IN (
-- list of ids
)
AND J.ClientId NOT IN
(
-- list of ids
)
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, ConsultantId, ConsultantEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT
CTE.ResultId,
CTE.JobId,
CTE.CandidateId,
CTE.Email,
CTE.EmailSent,
CTE.EmailSentDate,
CTE.EmailStatus,
CTE.CreateDate,
CTE.UserId,
CTE.UserEmail,
NULL
FROM CTE
INNER JOIN
(
SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
FROM CTE
) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1
请在下面查看我的更新查询:
WITH CTE AS
(
SELECT R.Id AS ResultId,
r.JobId,
r.CandidateId,
R.Email,
CAST(0 AS BIT) AS EmailSent,
NULL AS EmailSentDate,
'PICKUP' AS EmailStatus,
GETDATE() AS CreateDate,
C.Id AS UserId,
C.Email AS UserEmail,
NULL AS Subject
FROM RESULTS R
INNER JOIN JOB J ON R.JobId = J.Id
INNER JOIN Consultant C ON J.UserId = C.Id
WHERE
J.DCApproved = 1
AND (J.Closed = 0 OR J.Closed IS NULL)
AND (R.Email <> '' OR R.Email IS NOT NULL)
AND (R.EmailSent = 0 OR R.EmailSent IS NULL)
AND R.EmailSentDate IS NULL -- email has not been sent
AND (R.EmailStatus = '' OR R.EmailStatus IS NULL)
AND (R.IsEmailSubscribe = 'True' OR R.IsEmailSubscribe IS NULL)
-- not already been emailed for this job
AND NOT EXISTS (
SELECT SMTP.Email
FROM SMTP_Production SMTP
WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
)
-- not unsubscribed
AND NOT EXISTS (
SELECT u.Id FROM Unsubscribe u
WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))
)
AND NOT EXISTS (
SELECT SMTP.Id FROM SMTP_Production SMTP
WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
)
AND C.Id NOT IN (
-- LIST OF IDS
)
AND J.Id NOT IN (
-- LIST OF IDS
)
AND J.ClientId NOT IN
(
-- LIST OF IDS
)
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, UserId, UserEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT
CTE.ResultId,
CTE.JobId,
CTE.CandidateId,
CTE.Email,
CTE.EmailSent,
CTE.EmailSentDate,
CTE.EmailStatus,
CTE.CreateDate,
CTE.UserId,
CTE.UserEmail,
NULL
FROM CTE
INNER JOIN
(
SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
FROM CTE
) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1
GO
解决方案
ISNULL
在你的WHERE
and子句中使用JOIN
可能是这里的主要原因。对查询中的列使用函数会导致查询变为非 SARGable(这意味着它不能使用表上的任何索引,因此它可以扫描整个事情)。笔记; 对变量使用函数,通常WHERE
很好。例如. 诸如具有“包罗万象的查询”的味道之类的东西,可能会成为性能打击者;取决于你的设置。但这不是手头的讨论。WHERE SomeColumn = DATEADD(DAY, @n, @SomeDate)
WHERE SomeColumn = ISNULL(@Variable,0)
因此,对于这样的子句ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
,查询优化器非常头疼,并且您的查询中充斥着它们。您需要将这些替换为以下子句:
WHERE (J.Closed = 0 OR J.Closed IS NULL)
虽然没有区别,但也没有必要去CAST
那里0
。SQL Server 可以看到您正在与 a 进行比较,bit
因此也会将 a 解释0
为 a。
您还有一个EXISTS
withWHERE
子句ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')
。这将需要成为:
WHERE (u.EmailAddress = R.Email
OR (u.EmailAddress IS NULL AND R.Email IS NULL))
您需要更改子句(CTE 和子查询)中的所有ISNULL
用法,您应该会看到性能得到不错的提升。WHERE
推荐阅读
- c - 使用英特尔 MKL 编译 C 代码:ld:未知选项:--no-as-needed
- python - 如何使用计算机视觉 + Tesseract 读取电表?
- angular - Angular 7 应用程序中后端处理发送的 302 响应处理
- spring-boot - 带有外部身份提供者的 Keycloak - 没有来自服务器的 access_token
- c# - 是否可以对这种方法进行单元测试
- jdbc - 如何在有/没有数据流的情况下同步 Bigquery 和 Oracle 数据库?
- javascript - 异步函数与非异步函数中的异步函数?
- bash - 如何使用 bash 脚本重命名文件
- c - 理解 C 中定义变量的宏函数
- python - 使用 pandas 绘图时,错误-“索引的真值不明确。使用 a.empty、a.bool()、a.item()、a.any() 或 a.all()”是什么意思?