Loop through a parameter table and return the union of the query results

Question

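I have a Hive table, and I need to run a query like the one below for different values of the parameters date, identifier1, identifier2, lower, and upper, and then union the results together.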

Select 
col1,
col2,
new_time,
sum(col3),
case 
when "date" between date1 and date2 then 'No'
when "date" between date3 and date4 then 'Yes'
end as date_group,
case when "date" < e then 'test1' else 'test2' end as test_group,
'identifier1' as ID,
'identifier2' as ID2
FROM Table1
WHERE (new_time between time1 and time2)
      AND (tag between 'lower' and 'upper')
GROUP BY 
col1,
col2,
new_time,
case 
when "date" between date1 and date2 then 'No'
when "date" between date3 and date4 then 'Yes'
end,
case when "date" < e then 'test1' else 'test2' end 

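My initial idea was to create the parameter table below, loop over each row (each row holds one combination of parameter values), and union the results.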

+------------+-------------+-------------+--------+-------+
|    date    | identifier1 | identifier2 | lower  | upper |
+------------+-------------+-------------+--------+-------+
| 2019-05-12 |           1 | A           |     10 |    20 |
| 2019-07-10 |           2 | B           |     30 |    40 |
| 2019-04-10 |           3 | C           |     60 |    70 |
| 2019-04-11 |           4 | D           |    423 |   500 |
| 2019-07-10 |           5 | E           |     85 |    88 |
+------------+-------------+-------------+--------+-------+

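Two problems: I'm not sure how to approach this, and I'm not sure whether HiveQL allows loops. I would prefer a Hive solution, but a SQL solution would also work if I can move the intermediate tables to a relational database. The solution should be equivalent to a union query in which each branch has one row's parameter values substituted in.

[Image in the original post: the union query with the parameter values highlighted]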

Any help with a solution is appreciated. Thanks.

Tags: sql, hive, hiveql

Answer


Create a parameter table and use a join. I'm not 100% sure which names are parameters and which are columns, but something like this:

-- a, b, c, d, e are placeholders for the fixed date boundaries
-- (date1..date4 and e in the question).
-- `date` is backtick-quoted because Hive reads "date" as a string literal.
SELECT t1.col1, t1.col2, t1.new_time, sum(t1.col3),
       (case when p.`date` between a and b then 1
             when p.`date` between c and d then 2
        end) as date_group,
       (case when p.`date` < e then 'test1' else 'test2' end) as test_group,
       p.identifier1 as ID,
       p.identifier2 as ID2
FROM Table1 t1 CROSS JOIN
     params p
WHERE t1.new_time BETWEEN t1.time1 AND t1.time2 AND
      t1.tag BETWEEN p.lower AND p.upper
-- The CASE expressions here must match the SELECT list exactly.
GROUP BY t1.col1, t1.col2, t1.new_time,
      (case when p.`date` between a and b then 1
            when p.`date` between c and d then 2
       end),
      (case when p.`date` < e then 'test1' else 'test2' end),
      p.identifier1, p.identifier2;
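
To sanity-check the idea outside Hive, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption made only so the example runs anywhere; SQLite is not Hive). The sample rows, the trimmed-down column set, the cutoff date '2019-06-01', and the names JOIN_QUERY, ONE_ROW_QUERY, run_join, and run_loop_union are all hypothetical. It suggests that one join against the parameter table returns the same rows as running the query once per parameter row and unioning the results.

```python
import sqlite3

# In-memory SQLite stands in for Hive here; all data below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col1 TEXT, col3 INTEGER, tag INTEGER);
    CREATE TABLE params (date TEXT, identifier1 INTEGER, identifier2 TEXT,
                         lower INTEGER, upper INTEGER);
    INSERT INTO Table1 VALUES ('x', 1, 12), ('x', 2, 15), ('y', 4, 35), ('y', 8, 99);
    INSERT INTO params VALUES ('2019-05-12', 1, 'A', 10, 20),
                              ('2019-07-10', 2, 'B', 30, 40);
""")

# Join approach: every parameter row is paired with the Table1 rows it selects.
JOIN_QUERY = """
    SELECT t1.col1,
           SUM(t1.col3) AS total,
           CASE WHEN p.date < '2019-06-01' THEN 'test1' ELSE 'test2' END AS test_group,
           p.identifier1 AS ID,
           p.identifier2 AS ID2
    FROM Table1 t1 CROSS JOIN params p
    WHERE t1.tag BETWEEN p.lower AND p.upper
    GROUP BY t1.col1,
             CASE WHEN p.date < '2019-06-01' THEN 'test1' ELSE 'test2' END,
             p.identifier1, p.identifier2
"""

# Loop approach: the same query with one parameter row bound per execution.
ONE_ROW_QUERY = """
    SELECT col1,
           SUM(col3) AS total,
           CASE WHEN ? < '2019-06-01' THEN 'test1' ELSE 'test2' END AS test_group,
           ? AS ID,
           ? AS ID2
    FROM Table1
    WHERE tag BETWEEN ? AND ?
    GROUP BY col1
"""


def run_join(conn):
    """Return the rows produced by the single join query."""
    return conn.execute(JOIN_QUERY).fetchall()


def run_loop_union(conn):
    """Run ONE_ROW_QUERY once per parameter row and concatenate the results."""
    param_rows = conn.execute(
        "SELECT date, identifier1, identifier2, lower, upper FROM params"
    ).fetchall()
    rows = []
    for param_row in param_rows:
        rows += conn.execute(ONE_ROW_QUERY, param_row).fetchall()
    return rows


# Both approaches should yield the same set of rows.
print(sorted(run_join(conn)) == sorted(run_loop_union(conn)))
```

The join form needs no client-side loop, and adding a parameter combination only means inserting another row into params. Table1 rows whose tag falls outside every lower/upper range (tag 99 in the sample data) drop out of both results.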
