How to pivot data in Hive using aggregation

Problem description

I have a table with data like the following, and I want to pivot it using aggregation.

ColumnA    ColumnB            ColumnC
1          complete            Yes
1          complete            Yes
2          In progress         No
2          In progress         No 
3          Not yet started     initiate 
3          Not yet started     initiate 

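I want to pivot it so that each distinct ColumnB value becomes a column holding its row count per ColumnA, like this: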

ColumnA          Complete    In progress     Not yet started
1                 2               0                0
2                 0               2                0
3                 0               0                2

Is there any way to achieve this in Hive or Impala?

Tags: apache-spark, hadoop, hive, impala

Solution


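Use case expressions together with the sum aggregate. Each case yields 1 for rows whose ColumnB matches a given status and 0 otherwise, so the sum per ColumnA is the row count for that status: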

select ColumnA,    
       sum(case when ColumnB='complete'        then 1 else 0 end) as Complete,
       sum(case when ColumnB='In progress'     then 1 else 0 end) as In_progress,
       sum(case when ColumnB='Not yet started' then 1 else 0 end) as Not_yet_started
  from your_table --placeholder name; a bare "table" is a reserved word in Hive
 group by ColumnA
 order by ColumnA --remove if order is not necessary
;
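
To sanity-check the query against the sample rows without a Hive or Impala cluster, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in engine here (an assumption, not part of the original answer), and the table name status_table is hypothetical. The case/sum pattern itself is standard SQL, so the same statement should carry over to Hive and Impala.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    (1, "complete", "Yes"),
    (1, "complete", "Yes"),
    (2, "In progress", "No"),
    (2, "In progress", "No"),
    (3, "Not yet started", "initiate"),
    (3, "Not yet started", "initiate"),
]

# In-memory SQLite database used as a stand-in for Hive/Impala.
conn = sqlite3.connect(":memory:")
# status_table is a hypothetical table name chosen for this sketch.
conn.execute(
    "CREATE TABLE status_table (ColumnA INTEGER, ColumnB TEXT, ColumnC TEXT)"
)
conn.executemany("INSERT INTO status_table VALUES (?, ?, ?)", rows)

# Conditional aggregation: one sum(case ...) per status value to pivot.
pivot_sql = """
select ColumnA,
       sum(case when ColumnB = 'complete'        then 1 else 0 end) as Complete,
       sum(case when ColumnB = 'In progress'     then 1 else 0 end) as In_progress,
       sum(case when ColumnB = 'Not yet started' then 1 else 0 end) as Not_yet_started
  from status_table
 group by ColumnA
 order by ColumnA
"""

pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
# prints:
# (1, 2, 0, 0)
# (2, 0, 2, 0)
# (3, 0, 0, 2)
```

The string comparisons are case-sensitive, so the literals must match the stored values exactly ('complete', not 'Complete'). Each status to pivot needs its own sum(case ...) column, so the set of statuses has to be known when the query is written.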
