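hadoop - Finding the highest-scoring home team for each season

Problem description

Below is my Hive query, which tries to find the highest-scoring home team for each season.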

select t1.season , max(t1.TOTAL_Goals) as Highest_Score
  from
 (select season, home_team_id, sum(home_goals) TOTAL_Goals
    from game_kpark
   group by season, home_team_id
 ) as t1
 group by t1.season

The query above returns the following table:

t1.season   highest_score

20122013    118
20132014    179
20142015    174
20152016    173
20162017    204
20172018    185

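If I add t1.home_team_id after SELECT and at the end of the GROUP BY, the query returns the total score of every team in every season instead of only the highest score.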
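For reference, the modified query described above presumably looks like the sketch below. It is reconstructed from the description, not copied from the original post. Because the subquery already yields exactly one row per (season, home_team_id) pair, grouping the outer query by the same two columns leaves a single row in each group, so max() simply echoes every team's total.

```sql
-- Reconstructed sketch of the attempted variant (assumption: the exact query was not shown).
-- Each (season, home_team_id) group holds one row, so max() returns that row's total.
select t1.season, t1.home_team_id, max(t1.TOTAL_Goals) as Highest_Score
  from
 (select season, home_team_id, sum(home_goals) TOTAL_Goals
    from game_kpark
   group by season, home_team_id
 ) as t1
 group by t1.season, t1.home_team_id
```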

How do I write the query correctly so that it shows the team with the highest score for each season?

Tags: hadoop, hive, hiveql, greatest-n-per-group

Solution

Use the rank() analytic function:

select ranked.season, ranked.home_team_id, ranked.TOTAL_Goals
from
(
select totals.season, totals.home_team_id, totals.TOTAL_Goals,
       rank() over(partition by totals.season order by totals.TOTAL_Goals desc) as rnk
  from
 (--team season totals
  select season, home_team_id, sum(home_goals) TOTAL_Goals
    from game_kpark
   group by season, home_team_id
 ) totals
) ranked
where ranked.rnk=1; --keep only the top-ranked team(s) per season
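Note that rank() gives tied totals the same rank, so a season in which two teams share the highest total returns both rows. If exactly one row per season is wanted, one possible variant swaps in row_number() and adds a tie-breaker. This is a sketch and not part of the original answer; breaking ties by the lowest home_team_id is an arbitrary assumption.

```sql
-- Sketch: exactly one row per season, even when totals tie.
-- Assumption: ties are broken by the lowest home_team_id (arbitrary choice).
select ranked.season, ranked.home_team_id, ranked.TOTAL_Goals
  from
 (select totals.season, totals.home_team_id, totals.TOTAL_Goals,
         row_number() over(partition by totals.season
                           order by totals.TOTAL_Goals desc, totals.home_team_id) as rn
    from
   (-- per-team totals of home goals for each season
    select season, home_team_id, sum(home_goals) TOTAL_Goals
      from game_kpark
     group by season, home_team_id
   ) totals
 ) ranked
 where ranked.rn = 1;
```

Keeping rank() is the better choice when tied teams should all be reported. dense_rank() behaves the same as rank() for the rnk = 1 filter.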
