HiveQL: difference between rows and columns based on date

Problem description

I have a table (t_stocks) that contains data like the following:

exchanged,stock_symbol,closing_date,closing_price
NSE,TCS,2009-08-09,2200.1
NSE,TCS,2009-08-10,2300.1
NSE,TCS,2009-08-11,12200.1
NSE,TCS,2009-08-12,22300.1
NSE,TCS,2009-09-09,2200.1
NSE,TCS,2009-09-10,2300.1
NSE,TCS,2009-09-11,12200.1
NSE,TCS,2009-09-12,22300.1
NSE,INFY,2009-08-09,2500.34
NSE,INFY,2009-08-10,1500.34
NSE,INFY,2009-08-09,7500.34
NSE,INFY,2009-08-10,14500.34
NSE,INFY,2009-09-09,2500.34
NSE,INFY,2009-09-10,1500.34
NSE,INFY,2009-09-09,7500.34
NSE,INFY,2009-09-10,14500.34
NSE,TCS,2010-08-09,2200.1
NSE,TCS,2010-08-10,2300.1
NSE,TCS,2010-08-11,12200.1
NSE,TCS,2010-08-12,22300.1
NSE,TCS,2010-09-09,2200.1
NSE,TCS,2010-09-10,2300.1
NSE,TCS,2010-09-11,12200.1
NSE,TCS,2010-09-12,22300.1
NSE,INFY,2010-08-09,2500.34
NSE,INFY,2010-08-10,1500.34
NSE,INFY,2010-08-09,7500.34
NSE,INFY,2010-08-10,14500.34
NSE,INFY,2010-09-09,2500.34
NSE,INFY,2010-09-10,1500.34
NSE,INFY,2010-09-09,7500.34
NSE,INFY,2010-09-10,14500.34
...
...

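I need to write a query that produces a report with the columns exchanged, stock_symbol, closing_date, closing_price, yesterday_closing, and diff_yesterday_price (the difference between yesterday's price and today's price). The output should look like this: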

+----------------+-------------------+-------------------+--------------------+------------------------+-----------------------+--+
| exchanged      |     stock_symbol  |     closing_date  |     closing_price  |     yesterday_closing  | diff_yesterday_price  |
+----------------+-------------------+-------------------+--------------------+------------------------+-----------------------+--+
| NSE            | INFY              | 2009-08-09        | 2500.34            | NULL                   | NULL                  |
| NSE            | INFY              | 2009-08-09        | 7500.34            | 2500.34                | -5000                 |
| NSE            | INFY              | 2009-08-10        | 14500.34           | 7500.34                | -7000                 |
| NSE            | INFY              | 2009-08-10        | 1500.34            | 14500.34               | 13000                 |
| NSE            | INFY              | 2009-09-09        | 7500.34            | 1500.34                | -6000                 |
| NSE            | INFY              | 2009-09-09        | 2500.34            | 7500.34                | 5000                  |
| NSE            | INFY              | 2009-09-10        | 14500.34           | 1500.34                | -13000                |
| NSE            | INFY              | 2009-09-10        | 1500.34            | 2500.34                | 1000                  |
| NSE            | INFY              | 2010-08-09        | 7500.34            | 14500.34               | 7000                  |
| NSE            | INFY              | 2010-08-09        | 2500.34            | 7500.34                | 5000                  |
.....
.....

Can anyone give me some pointers on how to do this efficiently?

Thanks in advance,

Regards.

Tags: hive, hiveql

Solution


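You can use Hive's lag() window function to solve this. For each row, lag() returns a value from the previous row within the same partition, following the window's order by clause. It returns NULL when there is no previous row. The Hive documentation on windowing and analytics functions covers this in more detail.

Here is a working demo written for PostgreSQL; the same query also works in Hive.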

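-- yesterday_closing is the previous row's closing_price for the same stock;
-- it is NULL for the first row of each stock.
-- Rows that share a closing_date have no defined order among themselves.
select
    exchanged,
    stock_symbol,
    closing_date,
    closing_price,
    yesterday_closing,
    (yesterday_closing - closing_price) as diff_yesterday_price
from
(
    select
        *,
        lag(closing_price) over (partition by exchanged, stock_symbol order by closing_date) as yesterday_closing
    from t_stocks
) la
order by
    stock_symbol,
    closing_date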
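
To try the lag() approach without a Hive cluster, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in for Hive here, and it needs version 3.25 or later for window functions. The sample rows are a small subset of the question's data, chosen so that every date is unique per symbol. The helper name run_report and the round() call are assumptions added for illustration, not part of the original answer.

```python
import sqlite3

# A few rows taken from the question's data, with unique dates per symbol
# so that the ordering inside each partition is unambiguous.
rows = [
    ("NSE", "TCS", "2009-08-09", 2200.1),
    ("NSE", "TCS", "2009-08-10", 2300.1),
    ("NSE", "TCS", "2009-08-11", 12200.1),
    ("NSE", "INFY", "2009-08-09", 2500.34),
    ("NSE", "INFY", "2009-08-10", 1500.34),
]

# lag(closing_price) fetches the previous row's price within the same
# (exchanged, stock_symbol) partition, ordered by closing_date.
# round() is only here to hide floating-point noise in the difference.
QUERY = """
select
    exchanged, stock_symbol, closing_date, closing_price,
    yesterday_closing,
    round(yesterday_closing - closing_price, 2) as diff_yesterday_price
from (
    select
        *,
        lag(closing_price) over (
            partition by exchanged, stock_symbol
            order by closing_date
        ) as yesterday_closing
    from t_stocks
) la
order by stock_symbol, closing_date
"""


def run_report(sample_rows):
    """Load sample_rows into an in-memory table and return the report rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t_stocks ("
            "exchanged text, stock_symbol text, "
            "closing_date text, closing_price real)"
        )
        conn.executemany("insert into t_stocks values (?, ?, ?, ?)", sample_rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_report(rows):
        print(row)
```

The first row of each symbol comes back with None for both yesterday_closing and diff_yesterday_price, which corresponds to the NULL values in the expected report. Note that the question's full data set contains several rows with the same stock_symbol and closing_date; for those, the order of the tied rows is not defined, so an extra tie-breaking column in the order by clause would be needed to get repeatable results.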
