How to remove duplicates in one column based on the values of another column in psql

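Question

I have a database that is meant to model a library management system. I want to write a query that shows the top 3 most-borrowed books for each publisher, along with their rank (so the most-borrowed book from publisher X has rank 1). I already have a query that shows the titles of borrowed books, their publishers, and the number of times each book was borrowed. As you can see, Bloomsbury (UK) appears 7 times (once for each Harry Potter book), but I want it to show only the 3 most popular Harry Potter books. I would greatly appreciate any help.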

                  title                   |       publisher        | times
------------------------------------------+------------------------+------
 Harry Potter and the Philosopher's Stone | Bloomsbury (UK)        |    2
 Harry Potter and the Deathly Hallows     | Bloomsbury (UK)        |    2
 Harry Potter the Goblet of Fire          | Bloomsbury (UK)        |    3
 The Fellowship of the Ring               | George Allen & Unwin   |    1
 Calculus                                 | Paerson Addison Wesley |    1
 Go Set a Watchman                        | HarperCollins          |    1
 Harry Potter the Half-Blood Prince       | Bloomsbury (UK)        |    4
 Harry Potter and the Chamber of Secrets  | Bloomsbury (UK)        |    3
 Harry Potter and Prisoner of Azkaban     | Bloomsbury (UK)        |    2
 Nineteen Eighty-Four                     | Secker & Warburg       |    1
 Harry Potter the Order of the Phoenix    | Bloomsbury (UK)        |    4
 To Kill a Mockingbird                    | J.B.Lippincott & Co    |    1

The following query produces the output above.

SELECT title, publisher, COUNT(borrowed.resid) AS times
FROM borrowed 
  CROSS JOIN book 
  CROSS JOIN bookinfo 
WHERE borrowed.resid = book.resid 
  AND book.isbn = bookinfo.isbn 
  AND book.copynumber = borrowed.copynumber 
GROUP BY title, publisher;

Tags: sql, postgresql, greatest-n-per-group

Answer


SELECT title, publisher, times
FROM (
    -- ranks: rank each book within its publisher by borrow count
    SELECT *, RANK() OVER (PARTITION BY publisher ORDER BY times DESC) AS ranking
    FROM (
        -- counts: number of times each book was borrowed
        SELECT title, publisher, COUNT(resid) AS times
        FROM borrowed
        JOIN book USING (resid, copynumber)
        JOIN bookinfo USING (isbn)
        GROUP BY title, publisher
    ) AS counts
) AS ranks
WHERE ranking <= 3  -- keep only ranks 1 to 3 per publisher
ORDER BY publisher, times DESC;

counts is the part you wrote, adjusted to use USING, which joins on the identically named columns from both sides (making the query shorter).
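As a small illustration of why COUNT(resid) needs no table qualifier in counts: USING merges each joined column pair into a single output column, whereas with an explicit ON condition an unqualified resid would be ambiguous. The sketch below uses two made-up inline tables (the rows and the isbn values 'A' and 'B' are hypothetical, not taken from the question's database), so it should run on its own in PostgreSQL.

```sql
-- Hypothetical sample rows standing in for the real borrowed and book tables.
WITH borrowed(resid, copynumber) AS (
    VALUES (1, 1), (1, 1), (2, 1)
),
book(resid, copynumber, isbn) AS (
    VALUES (1, 1, 'A'), (2, 1, 'B')
)
SELECT isbn, COUNT(resid) AS times   -- resid is unambiguous thanks to USING
FROM borrowed
JOIN book USING (resid, copynumber)  -- same rows as ON borrowed.resid = book.resid
                                     --          AND borrowed.copynumber = book.copynumber
GROUP BY isbn
ORDER BY isbn;
```

With these sample rows, isbn 'A' is counted twice and isbn 'B' once.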

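ranks is the part that uses the RANK function (a window function) to rank the books within each publisher by how often they were borrowed.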

Finally, we get the top 3 by keeping only the rows whose ranking is 3 or lower.
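One thing worth checking is how ties are handled, because several Bloomsbury (UK) books in the question share the same borrow count. The sketch below compares RANK, DENSE_RANK and ROW_NUMBER on those seven rows. It is only an illustration: the inline counts table is a stand-in for the counts subquery, the publisher column is omitted because all rows share one publisher, and the aliases rnk, dense_rnk and row_num are made up for this example.

```sql
-- Stand-in for the counts subquery: the Bloomsbury (UK) rows from the question.
WITH counts(title, times) AS (
    VALUES
        ('Harry Potter the Half-Blood Prince',         4),
        ('Harry Potter the Order of the Phoenix',      4),
        ('Harry Potter and the Chamber of Secrets',    3),
        ('Harry Potter the Goblet of Fire',            3),
        ('Harry Potter and Prisoner of Azkaban',       2),
        ('Harry Potter and the Deathly Hallows',       2),
        ('Harry Potter and the Philosopher''s Stone',  2)
)
SELECT title,
       times,
       RANK()       OVER (ORDER BY times DESC)        AS rnk,        -- ties share a rank, gaps follow
       DENSE_RANK() OVER (ORDER BY times DESC)        AS dense_rnk,  -- ties share a rank, no gaps
       ROW_NUMBER() OVER (ORDER BY times DESC, title) AS row_num     -- always unique; title breaks ties
FROM counts
ORDER BY times DESC, title;
```

With this data, RANK assigns 1, 1, 3, 3, 5, 5, 5, so filtering on ranking <= 3 keeps four rows rather than three. DENSE_RANK assigns 1, 1, 2, 2, 3, 3, 3 and would keep all seven. ROW_NUMBER keeps exactly three, at the cost of an arbitrary tie-break (here, alphabetical by title). Which behavior is right depends on how "top 3" should treat ties. If the rank should also be displayed, as the question asks, the ranking column can be added to the outer SELECT list.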

