Keep only most recent rows by date

Question

Given this table:

CREATE TABLE BOOKINGS
    ([RES_ID] varchar(4), [ATTENDANCE_DATE] datetime, [AUTOID] int);

INSERT INTO BOOKINGS
    ([RES_ID], [ATTENDANCE_DATE], [AUTOID])
VALUES
    ('A001', '2018-01-01 00:00:00', 1),
    ('A002', '2018-01-01 00:00:00', 2),
    ('A003', '2018-01-01 00:00:00', 3),
    ('A001', '2018-01-02 00:00:00', 4),
    ('A002', '2018-01-02 00:00:00', 5),
    ('A003', '2018-01-02 00:00:00', 6),
    ('A002', '2018-01-03 00:00:00', 7),
    ('A003', '2018-01-03 00:00:00', 8);

I would like to remove all rows with RES_ID = 'A001', since it has no reservation at the most recent date (i.e. it was canceled).

I have tried this:

with cte as
(
  select *,
    row_number() over(partition by [res_id]
                      order by  [ATTENDANCE_DATE] desc) rn
  from BOOKINGS
)
DELETE FROM cte where rn > 1;

But this keeps the most recent row for 'A001' (AUTOID = 4), and I don't want that.

Expected output is:

A002    2018-01-03 00:00:00.000 7
A003    2018-01-03 00:00:00.000 8

Tags: sql, sql-server

Answer


One approach is to compare the most recent date for each res_id with the most recent date overall. You can do this with window functions:

with todelete as (
      select b.*,
             max(attendance_date) over (partition by res_id) as max_ad_resid,
             max(attendance_date) over () as max_ad
      from bookings b
     )
delete from todelete
    where max_ad_resid < max_ad;
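
With the sample data, the statement above removes only the 'A001' rows; the older rows for 'A002' and 'A003' (AUTOID 2, 3, 5, 6) remain. If the goal is exactly the two-row expected output from the question, one possible variant is to compare each row's own date with the overall most recent date. This is a sketch, assuming each res_id has at most one row per ATTENDANCE_DATE; the final select is included only to check the result.

```sql
-- Sketch: keep only rows dated at the overall most recent ATTENDANCE_DATE.
-- Assumes at most one row per res_id per date (true for the sample data).
with todelete as (
      select b.*,
             max(attendance_date) over () as max_ad  -- latest date in the whole table
      from bookings b
     )
delete from todelete
    where attendance_date < max_ad;

-- Check what is left.
select res_id, attendance_date, autoid
from bookings
order by autoid;
-- With the sample data this leaves:
-- A002    2018-01-03 00:00:00.000 7
-- A003    2018-01-03 00:00:00.000 8
```

Any res_id whose latest row is older than the overall latest date loses all of its rows, so this also covers the 'A001' case without a separate per-res_id maximum.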
