Get movies ordered by number of common genres, then by common keywords

Problem description

I have the following table movie:

id    |                  title                           |    year 
315      Harry Potter and the Deathly Hallows: Part 2         2011
407      Cinderella                                           2015
826      The Shape of Water                                   2017
799      Enchanted                                            2007
523      How to Train Your Dragon                             2010
618      Crazy Rich Asians                                    2018

and the table genre:

movie_id   |    genre
315             adventure
315             fantasy
315             mystery
315             drama
407             drama
407             fantasy
826             drama
826             thriller
826             adventure
826             horror
799             fantasy
799             comedy
799             romance
523             drama
523             fantasy
618             romance
618             comedy

and the table keyword:

movie_id     |       keyword 
315                  magic
315                  wizards
315                  witch
315                  friendship
315                  abuse 
407                  prince 
407                  fairy tale
407                  magic
407                  poor girl
407                  abuse 
826                  scientist
826                  mute
826                  friendship
799                  musical
799                  magic
799                  witch
799                  friendship
523                  viking
523                  boy
523                  fire
618                  singapore
618                  wedding
618                  money
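
To reproduce the problem locally, the three tables could be created roughly as follows. This is only a sketch: the question shows the data but not the schema, so the column types and keys here are assumptions.

```sql
-- Assumed schema: types and keys are guesses, only the data comes from the question.
CREATE TABLE movie (
    id    INT PRIMARY KEY,
    title VARCHAR(100) NOT NULL,
    year  SMALLINT NOT NULL
);

CREATE TABLE genre (
    movie_id INT NOT NULL,
    genre    VARCHAR(30) NOT NULL,
    PRIMARY KEY (movie_id, genre)
);

CREATE TABLE keyword (
    movie_id INT NOT NULL,
    keyword  VARCHAR(30) NOT NULL,
    PRIMARY KEY (movie_id, keyword)
);

INSERT INTO movie (id, title, year) VALUES
    (315, 'Harry Potter and the Deathly Hallows: Part 2', 2011),
    (407, 'Cinderella', 2015),
    (826, 'The Shape of Water', 2017),
    (799, 'Enchanted', 2007),
    (523, 'How to Train Your Dragon', 2010),
    (618, 'Crazy Rich Asians', 2018);

INSERT INTO genre (movie_id, genre) VALUES
    (315, 'adventure'), (315, 'fantasy'), (315, 'mystery'), (315, 'drama'),
    (407, 'drama'), (407, 'fantasy'),
    (826, 'drama'), (826, 'thriller'), (826, 'adventure'), (826, 'horror'),
    (799, 'fantasy'), (799, 'comedy'), (799, 'romance'),
    (523, 'drama'), (523, 'fantasy'),
    (618, 'romance'), (618, 'comedy');

INSERT INTO keyword (movie_id, keyword) VALUES
    (315, 'magic'), (315, 'wizards'), (315, 'witch'), (315, 'friendship'), (315, 'abuse'),
    (407, 'prince'), (407, 'fairy tale'), (407, 'magic'), (407, 'poor girl'), (407, 'abuse'),
    (826, 'scientist'), (826, 'mute'), (826, 'friendship'),
    (799, 'musical'), (799, 'magic'), (799, 'witch'), (799, 'friendship'),
    (523, 'viking'), (523, 'boy'), (523, 'fire'),
    (618, 'singapore'), (618, 'wedding'), (618, 'money');
```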

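I am trying to build a query that outputs all movies that share at least one genre with a given movie. If several movies have the same number of common genres, I want to rank those movies by their number of common keywords, highest first.

For example, if the movie is "Harry Potter and the Deathly Hallows: Part 2", the output of the query would be: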

title                      |      genre_frequency     |    keyword_frequency
Cinderella                        2                        2
The Shape of Water                2                        1
How to Train Your Dragon          2                        0
Enchanted                         1                        3

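Movies that have no genre in common with the specified movie (for example, Crazy Rich Asians) are not included in the output.

I have two queries that give me genre_frequency and keyword_frequency separately.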

-- Number of genres each movie shares with movie 315
select m.*, genre_frequency from movie m
join (
     select m.id, count(*) as genre_frequency
     from movie m
     join genre g on m.id=g.movie_id
     where g.genre in (select g1.genre 
                       from genre g1
                       where g1.movie_id=315)
     group by m.id
     ) f
 on m.id=f.id
 where m.id <> 315  -- exclude the reference movie itself
 order by f.genre_frequency desc;


-- Number of keywords each movie shares with movie 315
select m.*, keyword_frequency from movie m
join (
     select m.id, count(*) as keyword_frequency
     from movie m
     join keyword k on m.id=k.movie_id
     where k.keyword in (select k1.keyword
                       from keyword k1
                       where k1.movie_id=315)
     group by m.id
     ) f
 on m.id=f.id
 where m.id <> 315  -- exclude the reference movie itself
 order by f.keyword_frequency desc;

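The problem is that I want to combine the two queries above into a single query so that I get the output table shown above. I'm not sure how to do that. Any insight is appreciated.

Tags: mysql, sql

Solution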


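You can try using UNION ALL to combine the genre and keyword tables, adding a grp column that marks which of the two tables each row came from. Then use conditional aggregation to count each part separately.

Query #1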
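
As a small illustration of the conditional aggregation idea (not part of the original answer): a CASE expression without an ELSE branch yields NULL when the condition is false, and COUNT ignores NULLs, so each COUNT only sees the rows of its own group. The inline rows below are made up for the demonstration.

```sql
-- Three hypothetical rows: two tagged 'g', one tagged 'k'.
SELECT COUNT(CASE WHEN grp = 'g' THEN 1 END) AS g_rows,
       COUNT(CASE WHEN grp = 'k' THEN 1 END) AS k_rows
FROM (
    SELECT 'g' AS grp
    UNION ALL SELECT 'g'
    UNION ALL SELECT 'k'
) t;
-- g_rows = 2, k_rows = 1
```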

select m.title, 
      -- CASE without ELSE yields NULL, which COUNT ignores
      count(CASE WHEN t1.grp = 'g' THEN 1 END) as genre_frequency,
      count(CASE WHEN t1.grp = 'k' THEN 1 END) as keyword_frequency
from movie m
join (
    -- stack genres and keywords, tagging each row with its source
    SELECT movie_id,genre name,'g' grp
    FROM genre 
    UNION ALL
    SELECT movie_id,keyword,'k' grp
    FROM keyword
) t1 on m.id=t1.movie_id
-- compare genres only with genres and keywords only with keywords
where ((t1.grp = 'g' and t1.name in (select g1.genre 
               from genre g1
               where g1.movie_id=315)) or 
      (t1.grp = 'k' and t1.name in (select k1.keyword 
               from keyword k1
               where k1.movie_id=315))) 
AND m.id <> 315
group by m.title;

| title                    | genre_frequency | keyword_frequency |
| ------------------------ | --------------- | ----------------- |
| Cinderella               | 2               | 2                 |
| Enchanted                | 1               | 3                 |
| How to Train Your Dragon | 2               | 0                 |
| The Shape of Water       | 2               | 1                 |
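
The result above is not sorted, and the query as written would also return a movie that shares only keywords (no genres) with movie 315, if one existed. One possible variant, offered as a sketch rather than as the original answer's method, joins two pre-aggregated subqueries instead: an inner join on the genre counts drops movies with no common genre, a left join on the keyword counts keeps movies with zero common keywords, and an ORDER BY applies the requested ranking. It uses the table names from the question and assumes each (movie_id, genre) and (movie_id, keyword) pair appears only once.

```sql
SELECT m.title,
       g.genre_frequency,
       COALESCE(k.keyword_frequency, 0) AS keyword_frequency
FROM movie m
-- inner join: only movies sharing at least one genre with movie 315 survive
JOIN (
    SELECT g2.movie_id, COUNT(*) AS genre_frequency
    FROM genre g1
    JOIN genre g2 ON g2.genre = g1.genre
    WHERE g1.movie_id = 315
      AND g2.movie_id <> 315
    GROUP BY g2.movie_id
) g ON g.movie_id = m.id
-- left join: movies without common keywords stay in the result with a count of 0
LEFT JOIN (
    SELECT k2.movie_id, COUNT(*) AS keyword_frequency
    FROM keyword k1
    JOIN keyword k2 ON k2.keyword = k1.keyword
    WHERE k1.movie_id = 315
      AND k2.movie_id <> 315
    GROUP BY k2.movie_id
) k ON k.movie_id = m.id
ORDER BY g.genre_frequency DESC, keyword_frequency DESC;
```

On the sample data this returns Cinderella (2, 2), The Shape of Water (2, 1), How to Train Your Dragon (2, 0) and Enchanted (1, 3), in that order, which matches the output requested in the question.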

View on DB Fiddle
