首页 > 解决方案 > 取消透视具有最新记录的列

问题描述

学生记录会根据主题和更新日期进行更新。学生可以注册一门或多门科目。我想获得每个学生记录的大多数学科更新日期和状态。

    CREATE TABLE Student
    (
        StudentID int,
        FirstName varchar(100),
        LastName varchar(100),
        FullAddress varchar(100),
        CityState varchar(100),
        MathStatus varchar(100),
        MUpdateDate datetime2,
        ScienceStatus varchar(100),
        SUpdateDate datetime2,
        EnglishStatus varchar(100),
        EUpdateDate datetime2
    );

所需的查询输出,我正在使用 CTE 方法,但试图找到替代和更好的方法。

SELECT StudentID, FirstName, LastName, FullAddress, CityState, [SubjectStatus], UpdateDate
    FROM Student


    ;WITH orginal AS
    (SELECT * FROM Student)
    ,Math as
    (
        SELECT DISTINCT StudentID, FirstName, LastName, FullAddress, CityState,
        ROW_NUMBER OVER (PARTITION BY StudentID, MathStatus ORDER BY MUpdateDate DESC) as rn
        , _o.MathStatus as SubjectStatus, _o.MupdateDate as UpdateDate
        FROM original as o
        left join orignal as _o on o.StudentID = _o.StudentID
        where _o.MathStatus is not null and _o.MUpdateDate is not null
    )
    ,Science AS
    (
    ...--Same as Math
    )
    ,English AS
    (
    ...--Same As Math
    )
    SELECT * FROM Math WHERE rn = 1 
    UNION 
    SELECT * FROM Science WHERE rn = 1
    UNION 
    SELECT * FROM English WHERE rn = 1

标签: tsql

解决方案


第一:不推荐以非规范化形式存储数据。一些数据模型的重新设计可能是有序的。网络上有多种关于数据规范化的资源,比如这个

现在,我根据您编写的查询对源表的填充方式进行了一些猜测。我生成了一些示例数据,可以显示源数据是如何创建的。除此之外,我还减少了列数以减少我的打字工作。一般方法应该仍然有效。

样本数据

create table Student
(
    StudentId int,
    StudentName varchar(15),
    MathStat varchar(5),
    MathDate date,
    ScienceStat varchar(5),
    ScienceDate date
);

insert into Student (StudentID, StudentName, MathStat, MathDate, ScienceStat, ScienceDate) values
(1, 'John Smith',   'A',  '2020-01-01', 'B',  '2020-05-01'),
(1, 'John Smith',   'A',  '2020-01-01', 'B+', '2020-06-01'),  -- B for Science was updated to B+ month later
(2, 'Peter Parker', 'F',  '2020-01-01', 'A',  '2020-05-01'),
(2, 'Peter Parker', 'A+', '2020-03-01', 'A',  '2020-05-01'),  -- Spider-Man would never fail Math, fixed...
(3, 'Tom Holland',  null, null,         'A',  '2020-05-01'),
(3, 'Tom Holland',  'A-', '2020-07-01', 'A',  '2020-05-01');  -- Tom was sick for Math, but got a second chance

解决方案

您的问题标题已包含单词unpivot。该词实际上作为关键字存在于 T-SQL 中。您可以在文档中了解unpivot关键字。您自己的解决方案已经包含公用表表达式,这些结构应该看起来很熟悉。

脚步:

  1. cte_unpivot= 取消透视所有行,创建一列并使用表达式Subject将相应的值 ( SubjectStat, Date) 放在它旁边。case
  2. cte_recent= 对行编号以查找每个学生和学科的最新行。
  3. 仅选择那些最近的行。

这给出了:

with cte_unpivot as
(
    select  up.StudentId,
            up.StudentName,
            case up.[Subject]
                when 'MathStat' then 'Math'
                when 'ScienceStat' then 'Science'
            end as [Subject],
            up.SubjectStat,
            case up.[Subject]
                when 'MathStat' then up.MathDate
                when 'ScienceStat' then up.ScienceDate
            end as [Date]
    from Student s
    unpivot ([SubjectStat] for [Subject] in ([MathStat], [ScienceStat])) up
),
cte_recent as
(
    select  cu.StudentId, cu.StudentName, cu.[Subject], cu.SubjectStat, cu.[Date],
            row_number() over (partition by cu.StudentId, cu.[Subject] order by cu.[Date] desc) as [RowNum]
    from cte_unpivot cu
)
select cr.StudentId, cr.StudentName, cr.[Subject], cr.SubjectStat, cr.[Date]
from cte_recent cr
where cr.RowNum = 1;

结果

StudentId   StudentName     Subject SubjectStat Date
----------- --------------- ------- ----------- ----------
1           John Smith      Math    A           2020-01-01
1           John Smith      Science B+          2020-06-01
2           Peter Parker    Math    A+          2020-03-01
2           Peter Parker    Science A           2020-05-01
3           Tom Holland     Math    A-          2020-07-01
3           Tom Holland     Science A           2020-05-01

推荐阅读