首页 > 解决方案 > Hive 查询读取行并将其放入表列

问题描述

我正在创建具有不同国家日期的销售记录的蜂巢表。

create external table sales_records (sales_date string, country string, sales bigint)
--partitioned by (country string)
-- sorted by (sales_date)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;

我在里面插入了记录。

insert into sales_records values 
    ("2018-01-01", "us", 5000) ,
    ("2018-01-02", "us", 500) ,
    ("2018-01-03", "us", 5500) ,
    ("2018-01-04", "us", 1500) ,
    ("2018-01-01", "ind", 500) ,
    ("2018-01-03", "ind", 500) ,
    ("2018-01-04", "ind", 5500) ,
    ("2018-01-06", "ind", 1500) ;

现在我需要输出必须包含的记录:

us_sales = 当天在我们的销售额

ind_sales = 日期 ind 的销售额

差异 = | us_sales - ind_sales|

sales_date  | us_sales |  ind_sales  | difference
--------------------------------------------------
2018-01-01  |  5000    |   500       |  4500
-------------------------------------------------
2018-01-02  |  500     |     0       |   500
-------------------------------------------------
2018-01-03  |  5500    |   500       |  5000
-------------------------------------------------
2018-01-04  |  1500    |  5500       |  4000
-------------------------------------------------
2018-01-06  |     0    |  1500       |  1500
-------------------------------------------------

标签: hivehiveql

解决方案


case与聚合一起使用sales_date

select sales_date, us_sales,  ind_sales, abs(us_sales - ind_sales) as difference
from 
(
select sales_date, 
       sum(case when country='ind' then sales end)  as ind_sales,
       sum(case when country='us'  then sales end)  as us_sales
  from sales_records 
group by sales_date
)s
order by sales_date;

推荐阅读