首页 > 解决方案 > SQL 查询 - 长时间运行/占用 CPU 资源

问题描述

您好,我有以下 SQL 查询,平均需要 40 分钟才能运行,它引用的其中一个表中有超过 700 万条记录。

我已经通过数据库调整顾问运行了这个并应用了所有建议,我还在 sql 的活动监视器中对其进行了评估,并且没有推荐进一步的索引等。

任何建议都会很棒,在此先感谢

WITH CTE AS 
(
    SELECT r.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    r.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM Result R
    INNER JOIN Job J ON R.JobId = J.Id
    INNER JOIN User C ON J.UserId = C.Id
    WHERE 
    ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)
    AND ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
    AND ISNULL(R.Email,'') <> '' -- has an email address
    AND ISNULL(R.EmailSent, CAST(0 AS BIT)) = CAST(0 AS BIT) -- email has not been sent
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND ISNULL(R.EmailStatus,'') = '' -- email has not been sent
    AND ISNULL(R.IsEmailSubscribe, 'True') <> 'False' -- not unsubscribed
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE  ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- list of ids
    )
    AND J.Id NOT IN (
        -- list of ids
    )
    AND J.ClientId NOT IN 
    (
        -- list of ids
    )
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, ConsultantId, ConsultantEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1

请在下面查看我的更新查询:

WITH CTE AS 
(
    SELECT R.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    R.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM RESULTS R
    INNER JOIN JOB J ON R.JobId = J.Id
    INNER JOIN Consultant C ON J.UserId = C.Id
    WHERE 
    J.DCApproved = 1
    AND (J.Closed = 0 OR J.Closed IS NULL)
    AND (R.Email <> '' OR R.Email IS NOT NULL)
    AND (R.EmailSent = 0 OR R.EmailSent IS NULL)
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND (R.EmailStatus = '' OR R.EmailStatus IS NULL)
    AND (R.IsEmailSubscribe = 'True' OR R.IsEmailSubscribe IS NULL)
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.ClientId NOT IN 
    (
        -- LIST OF IDS
    )
)

INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, UserId, UserEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1


GO

标签: sqlsql-serverdatabasedatabase-performancesqlperformance

解决方案


ISNULL在你的WHEREand子句中使用JOIN可能是这里的主要原因。对查询中的列使用函数会导致查询变为非 SARGable(这意味着它不能使用表上的任何索引,因此它可以扫描整个事情)。笔记; 对变量使用函数,通常WHERE很好。例如. 诸如具有“包罗万象的查询”的味道之类的东西,可能会成为性能打击者;取决于你的设置。但这不是手头的讨论。WHERE SomeColumn = DATEADD(DAY, @n, @SomeDate)WHERE SomeColumn = ISNULL(@Variable,0)

因此,对于这样的子句ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT),查询优化器非常头疼,并且您的查询中充斥着它们。您需要将这些替换为以下子句:

WHERE (J.Closed = 0 OR J.Closed IS NULL)

虽然没有区别,但也没有必要去CAST那里0。SQL Server 可以看到您正在与 a 进行比较,bit因此也会将 a 解释0为 a。

您还有一个EXISTSwithWHERE子句ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')。这将需要成为:

WHERE (u.EmailAddress = R.Email
  OR   (u.EmailAddress IS NULL AND R.Email IS NULL))

您需要更改子句(CTE 和子查询)中的所有ISNULL用法,您应该会看到性能得到不错的提升。WHERE


推荐阅读