Counting the distinct values of each column in Hive

Problem description

Given the following table:

--------------------------------------------------------------------------------------
| browser (col1)  | os (col2)     | device (col3)  |    ...   |     city (col650)    |
--------------------------------------------------------------------------------------
| Chrome          | Android       | Samsung        |    ...   | Berlin               |
--------------------------------------------------------------------------------------
| Chrome          | Android       | Samsung        |    ...   | Cologne              |
--------------------------------------------------------------------------------------
| Mozilla         | Android       | Huawei         |    ...   | Munich               |
--------------------------------------------------------------------------------------
| Chrome          | Android       | Sony           |    ...   | Berlin               |
--------------------------------------------------------------------------------------

I want to get the number of distinct values in each column:

--------------------------------------------------------------------------------------
| browser (col1)  | os (col2)     | device (col3)  |    ...   |     city (col650)    |
--------------------------------------------------------------------------------------
| 2               | 1             | 3              |    ...   | 3                    |
--------------------------------------------------------------------------------------

The table has 650 columns, so it is not practical to write out every column by hand in the query.

Tags: hive, hiveql, hue

Solution


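You have to do this for all 650 columns. For each column, number the rows within each group of equal values, then add up the rows whose rank is 1. Every distinct value contributes exactly one such row, so the sum is the number of distinct values in that column.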

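select
    -- Each distinct value has exactly one row with rank 1,
    -- so each sum is the distinct count for that column.
    sum(case when col1Rank = 1 then 1 else 0 end) as col1,
    sum(case when col2Rank = 1 then 1 else 0 end) as col2,
    sum(case when col3Rank = 1 then 1 else 0 end) as col3
    -- ... repeat for the remaining columns
from
(
    select
        -- Number the rows within each group of equal values.
        -- NULLs form a group of their own, so NULL is counted as a value here.
        row_number() over (partition by col1 order by col1) as col1Rank,
        row_number() over (partition by col2 order by col2) as col2Rank,
        row_number() over (partition by col3 order by col3) as col3Rank
        -- ... repeat for the remaining columns
    from table_name
) A;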
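
Because the query needs one pair of expressions per column, typing it out for 650 columns is tedious. One possible workaround is to generate the query text with a small script and paste the result into Hive or Hue. The following is only a sketch in Python, not part of the original answer: the function name `build_distinct_count_query`, the `_rank` alias suffix, and the placeholder column names `col1` to `col3` are assumptions for illustration, and the column names are assumed to be plain identifiers that need no quoting. The real column list would have to be obtained separately, for example from the output of `DESCRIBE table_name`.

```python
def build_distinct_count_query(table, columns):
    """Build a HiveQL query that counts, per column, the rows ranked 1.

    Mirrors the row_number() approach above: one window expression and
    one sum(case ...) expression per column name in `columns`.
    """
    if not columns:
        raise ValueError("columns must not be empty")

    # Outer query: count the rows whose rank is 1 for each column.
    sums = ",\n".join(
        f"    sum(case when {c}_rank = 1 then 1 else 0 end) as {c}"
        for c in columns
    )
    # Inner query: rank the rows within each group of equal values.
    ranks = ",\n".join(
        f"        row_number() over (partition by {c} order by {c}) as {c}_rank"
        for c in columns
    )
    return (
        "select\n"
        f"{sums}\n"
        "from\n"
        "(\n"
        "    select\n"
        f"{ranks}\n"
        f"    from {table}\n"
        ") A;"
    )


if __name__ == "__main__":
    # Placeholder names; replace with the table's real column names.
    print(build_distinct_count_query("table_name", ["col1", "col2", "col3"]))
```

Passing the full list of 650 column names produces the complete query in one step. If a column already has a name ending in `_rank`, a different alias suffix would be needed to avoid a clash.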
