Get the total number of votes for each rating and the average. Postgres

Question

I have a table:

| book_id | reader_id | rating |
|    1    |     2     |   1    |
|    2    |     3     |   2    |
|    2    |     2     |   5    |
|    2    |     4     |   10   |

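A user can vote only once, with a rating from 1 to 10.

Question 1: What is the best way to get the average book rating and the number of votes for each rating? Something like: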

 avr                     ratings
"3.6"   "2" "0" "0" "0" "0" "0" "0" "0" "1" "0"

Currently I do it like this:

SELECT  
    AVG(rating),
    sum(case when rating  = 1 then 1  else 0 end) as r1,
    sum(case when rating  = 2 then 1  else 0 end) as r2,
    sum(case when rating  = 3 then 1  else 0 end) as r3,
    sum(case when rating  = 4 then 1  else 0 end) as r4,
    sum(case when rating  = 5 then 1  else 0 end) as r5,
    sum(case when rating  = 6 then 1  else 0 end) as r6,
    sum(case when rating  = 7 then 1  else 0 end) as r7,
    sum(case when rating  = 8 then 1  else 0 end) as r8,
    sum(case when rating  = 9 then 1  else 0 end) as r9,
    sum(case when rating  = 10 then 1  else 0 end) as r10
FROM books_rates
WHERE book_id=2;

Question 2: Why is the result of this query an object?

{
       avg: '1.00000000000000000000',
       r1: '3',
       r2: '0',
       r3: '0',
       r4: '0',
       r5: '0',
       r6: '0',
       r7: '0',
       r8: '0',
       r9: '0',
       r10: '0'
     }

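Question 3: Would it be better to create a column in the book table that stores the result of the query above, so that this heavy (is it?) query does not have to run every time a book is loaded? (Of course, this result column would be updated whenever someone submits a new rating.)

Tags: postgresql

Answer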


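Assuming your table contains clean data, meaning you already have constraints in place (for example, a unique key on book + reader that prevents the same user from voting twice for one book, and a constraint on the rating field that allows only integers in [1..10]), the following use of window functions may work: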

create table ratings (
  book_id int,
  reader_id int,
  rating int
);

insert into ratings (book_id, reader_id, rating) values 
(1,2,1),
(1,10,1),
(1,101,2),
(2,3,2),
(2,2,5),
(2,4,10);
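The sample table above is created without the constraints the answer assumes. A minimal sketch of how they might be declared follows; the constraint names are hypothetical, and the statement assumes the existing rows already satisfy both rules (the sample rows do).

```sql
-- Sketch only: constraint names are made up for illustration.
alter table ratings
  -- one vote per reader per book
  add constraint ratings_one_vote_per_reader unique (book_id, reader_id),
  -- only ratings from 1 to 10 are accepted
  add constraint ratings_rating_range check (rating between 1 and 10);
```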


select book_id,
       reader_id,
       rating,
       count(reader_id) over (partition by book_id, rating) as same_rating_votes,
       avg(rating) over (partition by book_id) as book_avg_rating
from ratings;

select sq.book_id,
       sq.rating,
       max(sq.same_rating_votes) as same_rating_votes,
       max(sq.book_avg_rating) as book_avg
from (
    select book_id,
           reader_id,
           rating,
           count(reader_id) over (partition by book_id, rating) as same_rating_votes,
           avg(rating) over (partition by book_id) as book_avg_rating
    from ratings
) as sq
group by sq.book_id, sq.rating;

http://www.sqlfiddle.com/#!17/eb4ea/2

http://www.sqlfiddle.com/#!17/eb4ea/7

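Each row of the first query gives you the book, the reader, their vote, the book's average rating, and the number of identical votes for that book. If you absolutely need the result in a single row, you could perhaps aggregate further with array_agg or something similar.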
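One possible way to collapse the result into a single row per book with array_agg is sketched below. It is not part of the original fiddles: the CTE names per_book and per_rating are illustrative, and generate_series(1, 10) is used on the assumption that ratings are always integers from 1 to 10, so that ratings nobody chose still appear with a count of 0.

```sql
with per_book as (
    -- average rating per book
    select book_id, avg(rating) as book_avg
    from ratings
    group by book_id
),
per_rating as (
    -- one row per (book, possible rating), including ratings with no votes
    select pb.book_id,
           pb.book_avg,
           s.rating,
           count(r.reader_id) as votes
    from per_book pb
    cross join generate_series(1, 10) as s(rating)
    left join ratings r
           on r.book_id = pb.book_id
          and r.rating = s.rating
    group by pb.book_id, pb.book_avg, s.rating
)
select book_id,
       book_avg,
       -- position i in the array holds the number of votes for rating i
       array_agg(votes order by rating) as votes_per_rating
from per_rating
group by book_id, book_avg
order by book_id;
```

With the sample rows inserted above, votes_per_rating should come out as {2,1,0,0,0,0,0,0,0,0} for book 1 and {0,1,0,0,1,0,0,0,0,1} for book 2.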

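As for question 3, I believe that if you have millions of books and votes, computing everything in a materialized view and refreshing it nightly may be a reasonable idea.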
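A rough sketch of the materialized view idea, assuming the ratings table from above. The view and index names are hypothetical, and the nightly schedule itself would have to come from an external scheduler such as cron, which is not shown here.

```sql
-- Sketch only: book_rating_stats is a made-up name.
create materialized view book_rating_stats as
select book_id,
       avg(rating)                         as book_avg,
       count(*)                            as total_votes,
       count(*) filter (where rating = 1)  as r1,
       count(*) filter (where rating = 2)  as r2,
       count(*) filter (where rating = 3)  as r3,
       count(*) filter (where rating = 4)  as r4,
       count(*) filter (where rating = 5)  as r5,
       count(*) filter (where rating = 6)  as r6,
       count(*) filter (where rating = 7)  as r7,
       count(*) filter (where rating = 8)  as r8,
       count(*) filter (where rating = 9)  as r9,
       count(*) filter (where rating = 10) as r10
from ratings
group by book_id;

-- A unique index is required before the view can be refreshed concurrently.
create unique index book_rating_stats_book_id_idx
    on book_rating_stats (book_id);

-- Run this from the nightly job; readers are not blocked during the refresh.
refresh materialized view concurrently book_rating_stats;
```

Loading a book then becomes a lookup by book_id in book_rating_stats, at the cost of the numbers being up to one refresh interval out of date.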

