Hive - new column that repeats a value based on the max of another column

Problem description

I have a table 'mytable' whose contents look like this:

currenttime             racetype            raceid
2018-01-01 03:15:00     gold                22
2018-01-01 04:15:00     silver              22
2019-01-01 04:15:00     bronze              22
2017-01-02 11:44:00     platinum            22

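I am trying to create another column based on the maximum currenttime. It should take the racetype from the row with the maximum currenttime and repeat that value in the new column for every row, like this: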

currenttime             racetype          raceid     besttype
2018-01-01 03:15:00     gold              22         bronze
2018-01-01 04:15:00     silver            22         bronze
2019-01-01 04:15:00     bronze            22         bronze
2017-01-02 11:44:00     platinum          22         bronze

If there are other raceid values, it should do the same for each of them, for example:

currenttime             racetype          raceid     besttype
2018-01-01 03:15:00     gold              22         bronze
2018-01-01 04:15:00     silver            22         bronze
2019-01-01 04:15:00     bronze            22         bronze
2017-01-02 11:44:00     platinum          22         bronze
2011-01-01 03:15:00     gold              09         silver
2022-01-01 04:15:00     silver            09         silver
2002-01-01 04:15:00     bronze            09         silver

Currently I have this query:

select mt.raceid, tt.racetype, MAX(tt.currenttime) 
OVER (PARTITION by mt.raceid) 
from mytable mt 
join tabletwo tt on mt.id = tt.id
where mt.raceid = 22

This query does not produce the expected output. It returns:

raceid         racetype         col0
22             gold             2019-01-01 04:15:00 
22             silver           2019-01-01 04:15:00 
22             platinum         2019-01-01 04:15:00 
22             bronze           2019-01-01 04:15:00

How can I get the expected result shown in the second and third examples above?

Tags: hive, hiveql

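Use the first_value analytic function, ordering each raceid partition by currenttime descending so that the first row is the latest one: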

select currenttime, racetype, raceid,
       first_value(racetype) over(partition by raceid order by currenttime desc) as besttype
  from mytable

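Or use last_value with ascending order. The explicit frame matters here: when ORDER BY is given without a frame clause, Hive defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so last_value would return each row's own racetype instead of the latest one:

select currenttime, racetype, raceid,
       last_value(racetype) over(partition by raceid order by currenttime
                                 rows between unbounded preceding and unbounded following) as besttype
  from mytable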
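
To try both variants without touching a real table, the sample rows from the question can be inlined in a CTE. This is only a sketch under a few assumptions: the CTE name mytable_sample and the column aliases besttype_first and besttype_last are made up for illustration, currenttime is stored as a string in 'yyyy-MM-dd HH:mm:ss' form (which sorts the same way as a timestamp), and raceid is kept as a string so that '09' keeps its leading zero.

```sql
-- Hypothetical inline data standing in for mytable (rows copied from the question).
with mytable_sample as (
  select '2018-01-01 03:15:00' as currenttime, 'gold' as racetype, '22' as raceid
  union all select '2018-01-01 04:15:00', 'silver',   '22'
  union all select '2019-01-01 04:15:00', 'bronze',   '22'
  union all select '2017-01-02 11:44:00', 'platinum', '22'
  union all select '2011-01-01 03:15:00', 'gold',     '09'
  union all select '2022-01-01 04:15:00', 'silver',   '09'
  union all select '2002-01-01 04:15:00', 'bronze',   '09'
)
select currenttime, racetype, raceid,
       -- Latest row first, so first_value picks the racetype at max(currenttime).
       first_value(racetype) over (partition by raceid
                                   order by currenttime desc) as besttype_first,
       -- Earliest row first; the frame is widened to the whole partition so that
       -- last_value sees the latest row rather than stopping at the current one.
       last_value(racetype) over (partition by raceid
                                  order by currenttime
                                  rows between unbounded preceding
                                           and unbounded following) as besttype_last
  from mytable_sample;
```

Both computed columns should agree with the besttype column in the question's expected output, that is, bronze for raceid 22 and silver for raceid 09. If two rows in the same raceid share the maximum currenttime, which racetype is picked is not determined by this ordering alone, so a tie-breaking column would need to be added to the order by.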
