SQL: Calculating a percentage by joining a subquery to another table

Problem description

[Image of the dataset; not preserved]

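I have the dataset above, and for each year I need to report the percentage of movies made that year that have only female actors, along with the total number of movies made that year. For example, one answer row is: 1990 31.81 13522, meaning that 13,522 movies were made in 1990 and 31.81% of them had only female actors.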

To get the movies with only female actors, I wrote the following code:

SELECT a.year as Year, COUNT(a.title) AS Female_Movies, a.title 
FROM Movie a
WHERE a.title NOT IN (

  SELECT b.title from Movie b
  Inner Join M_cast c
  on TRIM(c.MID) = b.MID
  Inner Join Person d
  on TRIM(c.PID) = d.PID
  WHERE d.Gender='Male'
  GROUP BY b.title
  )
GROUP BY a.year,a.title
Order By a.year asc

The total number of movies per year can be found with the following:

SELECT a.year, count(a.title) AS Total_Movies
FROM Movie a
GROUP BY a.year
ORDER BY COUNT(a.title) DESC

Combining the two, I wrote the following code:

SELECT z.year as Year, count(z.title) AS Total_Movies, count(x.title) as Female_movies, count(z.title)/ count(x.title) As percentage
FROM Movie z
Inner Join (
SELECT a.year as Year, COUNT(a.title) AS Female_Movies, a.title 
FROM Movie a
WHERE a.title NOT IN (

  SELECT b.title from Movie b
  Inner Join M_cast c
  on TRIM(c.MID) = b.MID
  Inner Join Person d
  on TRIM(c.PID) = d.PID
  WHERE d.Gender='Male'
  GROUP BY b.title
  )
GROUP BY a.year,a.title
Order By a.year asc
)x
on x.year = z.year 
GROUP BY z.year
ORDER BY COUNT(z.title) DESC

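However, in the output I correctly see only the years that have female-only movies, but the total movie count is equal to Female_movies, so the percentage always comes out as 1. I tried debugging the code but am not sure where it goes wrong. Any insight would be appreciated.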

Tags: mysql, sql

Solution


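You are assuming that "z" contains all movies, but because you inner join it to the female-only movies on year, every row of "z" is paired with every female-only movie of the same year. Both counts are then taken over the same joined rows, so they are always equal, and years without any female-only movie drop out entirely. You can address this with a "LEFT JOIN" between results that have already been aggregated per year.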
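To see why the two counts come out equal, here is a small self-contained sketch. The `movie` and `female_only` CTEs hold made-up toy rows (they are not the tables from the question) and stand in for "all movies" and "female-only movies".

```sql
-- Hypothetical toy data: three movies in 1990 (two of them female-only), one in 1991.
WITH movie (year, title) AS (
    SELECT 1990, 'A' UNION ALL
    SELECT 1990, 'B' UNION ALL
    SELECT 1990, 'C' UNION ALL
    SELECT 1991, 'D'
),
female_only (year, title) AS (
    SELECT 1990, 'A' UNION ALL
    SELECT 1990, 'B'
)
SELECT z.year,
       COUNT(z.title) AS total_movies,
       COUNT(x.title) AS female_movies
FROM movie z
INNER JOIN female_only x ON x.year = z.year   -- joins on year only
GROUP BY z.year;
-- 1990: 3 movies x 2 female-only movies = 6 joined rows, so both counts are 6.
-- 1991 has no female-only movie, so the inner join drops it.
```

Joining on year alone multiplies the rows within each year, which is why counting either side after the join gives the same number.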

Assuming your two queries are correct, you can join them using "WITH" like this:

WITH allmovies (year, cnt) AS
(SELECT a.year, COUNT(a.title)
 FROM Movie a
 GROUP BY a.year)
,

femalemovies (year, cnt) AS
(SELECT a.year, COUNT(a.title)
 FROM Movie a
 WHERE a.title NOT IN (

   SELECT b.title FROM Movie b
   INNER JOIN M_cast c
   ON TRIM(c.MID) = b.MID
   INNER JOIN Person d
   ON TRIM(c.PID) = d.PID
   WHERE d.Gender = 'Male'
   )
 -- group by year only (not by title), so there is one row per year to join on
 GROUP BY a.year)
SELECT allmovies.year,
       -- years without female-only movies have no match, so default their count to 0
       ROUND(100.0 * COALESCE(femalemovies.cnt, 0) / allmovies.cnt, 2) AS percentage,
       allmovies.cnt AS total_movies
FROM allmovies
LEFT JOIN femalemovies ON allmovies.year = femalemovies.year
ORDER BY allmovies.year
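As an alternative to joining two CTEs, the same report could be computed in a single pass by flagging each movie and then aggregating per year. This is only a sketch under assumptions that the question does not confirm: that `Movie.MID` uniquely identifies a movie (so matching is done on `MID` rather than on `title`), and that `M_cast` and `Person` are keyed as in the original queries. The alias `t` and the column `female_only` are names introduced here for illustration.

```sql
SELECT t.year,
       ROUND(100.0 * SUM(t.female_only) / COUNT(*), 2) AS percentage,
       COUNT(*) AS total_movies
FROM (
    SELECT m.year,
           -- 1 when no male cast member exists for this movie, else 0
           CASE WHEN NOT EXISTS (
                    SELECT 1
                    FROM M_cast c
                    INNER JOIN Person p ON TRIM(c.PID) = p.PID
                    WHERE TRIM(c.MID) = m.MID
                      AND p.Gender = 'Male'
                ) THEN 1 ELSE 0 END AS female_only
    FROM Movie m
) t
GROUP BY t.year
ORDER BY t.year;
```

Like the `NOT IN` version, this counts a movie with no cast rows at all as female-only, and it treats a missing `Gender` as not male. Unlike `NOT IN`, `NOT EXISTS` is not affected by NULL values in the compared column.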
