首页 > 解决方案 > Collect_set 与 Hive 中的案例

问题描述

有没有办法重写下面的 case 语句,而不是写 Collect_set[0] 4 次,我可以使用单个 Collect_set 获得相同的结果。

    select id,collect_set(name)[0] as name,sum(salary),
    cASE WHEN month(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy')))) 
    IN (01,02,03) THEN 
    CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy'))))-1,'-'),
    substr(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy')))),3,4))
     ELSE CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy')))),'-'),
    SUBSTR(year(to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy'))))+1,3,4)) 
     END as fy from testing_1.collect_set_test group by id;

我在查询下面写。

select collect_set(CASE WHEN month(to_date(from_unixtime(unix_timestamp(date1), 'dd-MM-yyyy'))) 
IN (01,02,03) THEN CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(date1), 'dd-MM-yyyy')))-1,'-'),
substr(year(to_date(from_unixtime(unix_timestamp(date1), 'dd-MM-yyyy'))),3,4)) 
ELSE
 CONCAT(CONCAT(year(to_date(from_unixtime(unix_timestamp(date1), 'dd-MM-yyyy'))),'-'),
 SUBSTR(year(to_date(from_unixtime(unix_timestamp(date1), 'dd-MM-yyyy')))+1,3,4))) [0]
 END as fy from testing_1.collect_set_test group by id;

但它的给出低于错误。

    FAILED: ParseException line 1:446 missing KW_END at ')' near ']' in selection target
    line 1:452 cannot recognize input near 'END' 'as' 'fy' in selection target

有人可以指导我如何重写相同的内容。

标签: arrayshivehiveql

解决方案


将所有具有组和日期转换的聚合移动到子查询中,在上子查询中计算 fy:

select id, name, salary,
    cASE WHEN month(date1) 
               IN (01,02,03) THEN CONCAT(CONCAT(year(date1))-1,'-'),
                              substr(year(date1),3,4))
         ELSE CONCAT(CONCAT(year(date1),'-'),
              SUBSTR(year(date1)+1,3,4)) 
     END as fy 
     from 
          (select to_date(from_unixtime(unix_timestamp(collect_set(date1)[0], 'dd-MM-yyyy'))) as date1, 
                  collect_set(name)[0]  as name, 
                  sum(salary) as salary, 
                  id 
            from testing_1.collect_set_test group by id) s
 ;

推荐阅读