Selecting columns from the latest record in a joined table when grouping on the original table

Problem description

Suppose we have three Postgres tables, shown here as simplified table definitions:

    CREATE TABLE IF NOT EXISTS book_details(
        book_id bigint NOT NULL,
        title VARCHAR,
        category VARCHAR,
        author_id bigint NOT NULL,
        updated_at timestamp without time zone NOT NULL
    );

    CREATE TABLE IF NOT EXISTS book_rentals(
        rental_id bigint NOT NULL,
        book_id bigint NOT NULL,
        PRIMARY KEY (rental_id, book_id)
    );

    CREATE TABLE IF NOT EXISTS rental_events(
        rental_id bigint NOT NULL,
        reader_id bigint NOT NULL,
        started_at timestamp without time zone NOT NULL,
        ended_at timestamp without time zone NOT NULL
    );

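Now suppose we want the 5 most frequently rented books together with their latest titles (the title from the most recent matching book_details entry). What is an efficient way to do this? (Complete the pseudo-query below.)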

        SELECT COUNT(DISTINCT book_rentals.rental_id) AS rental_count,
               [[latest(book_details).title]]
        FROM book_rentals
        INNER JOIN book_details
        ON book_rentals.book_id = book_details.book_id
        GROUP BY book_rentals.book_id
        ORDER BY rental_count DESC
        LIMIT 5;

Finally, the same question, but considering only books that currently belong to a given category, i.e. only those where latest(book_details).category = 'Sci-Fi'.

Tags: sql, postgresql

Solution


Use a CTE that returns the latest row for each book, join it to book_rentals, and aggregate:

    WITH books AS (
        -- Keep only the most recent book_details row per book (rn = 1).
        SELECT b.book_id, b.title, b.category
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY book_id ORDER BY updated_at DESC) AS rn
            FROM book_details
        ) b
        WHERE b.rn = 1
    )
    -- Count rentals per book, filtered on the book's latest category.
    SELECT b.title, COUNT(DISTINCT r.rental_id) AS rental_count
    FROM books b
    INNER JOIN book_rentals r ON r.book_id = b.book_id
    WHERE b.category = 'Sci-Fi'
    GROUP BY b.book_id, b.title
    ORDER BY rental_count DESC
    LIMIT 5;

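I'm not sure whether DISTINCT is needed in COUNT(DISTINCT r.rental_id) or whether you can use COUNT(*). Since the CTE returns one row per book and (rental_id, book_id) is the primary key of book_rentals, each rental_id should appear at most once per book, so the two counts should agree.
Remove the WHERE clause to have the query consider all books.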
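
For the first variant (all books, no category filter), the latest row per book can also be picked with PostgreSQL's DISTINCT ON instead of ROW_NUMBER(). The following is only a minimal sketch: the temporary tables mirror the question's definitions, and the sample rows and titles are made up for illustration. Ties on updated_at are not handled and would need an extra ORDER BY column.

```sql
-- Temporary tables mirroring the question's schema (dropped at session end).
CREATE TEMP TABLE book_details(
    book_id bigint NOT NULL,
    title VARCHAR,
    category VARCHAR,
    author_id bigint NOT NULL,
    updated_at timestamp without time zone NOT NULL
);

CREATE TEMP TABLE book_rentals(
    rental_id bigint NOT NULL,
    book_id bigint NOT NULL,
    PRIMARY KEY (rental_id, book_id)
);

-- Hypothetical sample data: book 1 was renamed, book 2 has a single entry.
INSERT INTO book_details VALUES
    (1, 'Old Title',  'Sci-Fi',  10, '2020-01-01'),
    (1, 'New Title',  'Sci-Fi',  10, '2021-01-01'),
    (2, 'Other Book', 'Classic', 11, '2020-06-01');

INSERT INTO book_rentals VALUES (100, 1), (101, 1), (101, 2);

WITH books AS (
    -- DISTINCT ON keeps the first row per book_id in ORDER BY order,
    -- i.e. the row with the greatest updated_at.
    SELECT DISTINCT ON (book_id) book_id, title, category
    FROM book_details
    ORDER BY book_id, updated_at DESC
)
SELECT b.title, COUNT(DISTINCT r.rental_id) AS rental_count
FROM books b
INNER JOIN book_rentals r ON r.book_id = b.book_id
GROUP BY b.book_id, b.title
ORDER BY rental_count DESC
LIMIT 5;
```

With these sample rows, book 1 is reported under its latest title, 'New Title', with a rental count of 2, followed by 'Other Book' with 1. Adding WHERE b.category = 'Sci-Fi' before the GROUP BY restricts the result to the category variant.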

