Creating a flag column after comparing two tables

Problem description

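I have two different report tables, each containing a datetime and a report owner. I want to select the people who have written at least one report. I also need a calculated field showing which report number they wrote. Report 1 takes priority, so if someone has written report 1 at any time, the new report_number column should be 1; otherwise it should be 2 (for report 2).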

'people' table
| person_id | full_name
--------------------------
| 1         | John L Smith
| 2         | Carl M Selt
| 3         | Another Person

'report_1' table
| report_1_id | author_person_id | date_entered | other_columns
---------------------------------------------------------------
| 1           | 1                | 2018-01-12   | foo
| 2           | 1                | 2018-02-18   | foo foo

'report_2' table
| report_2_id | author_person_id | date_entered | other_columns
---------------------------------------------------------------
| 1           | 1                | 2018-03-21   | bar
| 2           | 1                | 2018-03-28   | bar bar
| 3           | 2                | 2018-04-16   | baz
| 4           | 2                | 2018-04-30   | baz baz

Desired result:

| full_name    | report_number
---------------------------
| John L Smith | 1
| Carl M Selt  | 2

Note that John's report_number is 1 even though he also authored report 2.

Report 1 and report 2 have different additional columns, even though they look the same above.

What I tried:

/* Get people from both reports */
WITH report_1_people AS (
    SELECT P.full_name
    FROM report_1 R1
    INNER JOIN people P ON R1.author_person_id = P.person_id
    WHERE P.full_name IS NOT NULL 
    AND P.full_name <> ''
), report_2_people AS (
    SELECT P2.full_name
    FROM report_2 R2
    INNER JOIN people P2 ON R2.author_person_id = P2.person_id
    WHERE P2.full_name IS NOT NULL 
    AND P2.full_name <> ''
)
SELECT 
    P.full_name,
    CASE WHEN P.full_name IN ( /* Check if in report 1 */
                    SELECT full_name
                    FROM report_1)
                    THEN 1
            ELSE 2
            END AS report_number
FROM people P
WHERE P.full_name IS NOT NULL AND P.full_name <> ''
/* Eliminate duplicate names */
GROUP BY P.full_name 
/* Filter only who either authored report 1 or report 2 */
HAVING P.full_name IN (SELECT full_name
                       FROM report_1_people)
OR P.full_name IN (SELECT full_name
                   FROM report_2_people)

Note: the GROUP BY is there because, for some reason, the people table contains duplicate entries.

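The query ran for so long (24+ hours) that the database connection was dropped, so I assume I am doing something wrong. Is there a better way to compute this flag column based on two tables? I am relatively new to SQL, so I wonder whether there is another way of thinking about this and I am overcomplicating the SQL logic.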

Tags: sql, sql-server, performance, tsql, sql-server-2017

Solution


You can use OUTER APPLY:

SELECT person_id, full_name, COALESCE(ca1.report_num, ca2.report_num) AS report_number
FROM people
OUTER APPLY (SELECT TOP (1) 1 FROM report_1 WHERE author_person_id = people.person_id) AS ca1(report_num)
OUTER APPLY (SELECT TOP (1) 2 FROM report_2 WHERE author_person_id = people.person_id) AS ca2(report_num)

Demo on db<>fiddle
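
A possible variation, offered as a sketch rather than as the answerer's method: the OUTER APPLY query above returns one row per person, and for someone who wrote neither report the COALESCE yields NULL. If only authors should appear, as the question asks, an EXISTS-based filter is one way to express that. The temp tables (#people, #report_1, #report_2) are assumptions made only so the script runs on its own; they hold the sample rows from the question, with other_columns omitted.

```sql
-- Assumed setup: temp tables holding the question's sample rows.
CREATE TABLE #people   (person_id   int PRIMARY KEY, full_name nvarchar(100));
CREATE TABLE #report_1 (report_1_id int PRIMARY KEY, author_person_id int, date_entered date);
CREATE TABLE #report_2 (report_2_id int PRIMARY KEY, author_person_id int, date_entered date);

INSERT INTO #people VALUES
    (1, N'John L Smith'), (2, N'Carl M Selt'), (3, N'Another Person');
INSERT INTO #report_1 VALUES
    (1, 1, '2018-01-12'), (2, 1, '2018-02-18');
INSERT INTO #report_2 VALUES
    (1, 1, '2018-03-21'), (2, 1, '2018-03-28'),
    (3, 2, '2018-04-16'), (4, 2, '2018-04-30');

-- Match on person_id (the key), not on full_name.
SELECT p.full_name,
       CASE WHEN EXISTS (SELECT 1 FROM #report_1 AS r1
                         WHERE r1.author_person_id = p.person_id)
            THEN 1      -- report 1 takes priority
            ELSE 2      -- otherwise the WHERE clause guarantees a report 2
       END AS report_number
FROM #people AS p
WHERE EXISTS (SELECT 1 FROM #report_1 AS r1
              WHERE r1.author_person_id = p.person_id)
   OR EXISTS (SELECT 1 FROM #report_2 AS r2
              WHERE r2.author_person_id = p.person_id)
ORDER BY p.person_id;

DROP TABLE #people, #report_1, #report_2;
```

With the sample rows this returns John L Smith with report_number 1 and Carl M Selt with report_number 2; Another Person is filtered out. If the real people table holds the same name under several person_id values, as the question mentions, a GROUP BY on full_name with MIN(report_number) would probably still be needed. An index on author_person_id in each report table may also help, though that depends on the actual data.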

