Presto/AWS Athena: selecting the latest version of each record

Question

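I have an order event table in which each order has only a few entries by the time it is complete. Some orders are cancelled or refunded. I am trying to select every order whose latest event has the status 'OrderConfirmed'. I thought the following SQL would do it, but AWS Athena complains that the column "latest_order_update.latest_update" cannot be resolved. Any clues?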

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time = latest_order_update.latest_update AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;

Tags: sql, amazon-athena, presto

Answer


You can rewrite it with ROW_NUMBER:

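WITH cte AS (
  SELECT oo.unique_id,
         oo.header_event_name,
         -- rn = 1 marks the most recent event of each order
         ROW_NUMBER() OVER (PARTITION BY oo.unique_id ORDER BY oo.event_time DESC) AS rn
  FROM orderevent_order oo
)
SELECT unique_id
FROM cte
WHERE rn = 1
  AND header_event_name = 'OrderConfirmed';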
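
To see how the ROW_NUMBER approach behaves, here is a small self-contained sketch that can be pasted into Athena as-is. The rows in sample_events are invented for illustration, and the event names 'OrderCreated' and 'OrderCancelled' are assumptions (only 'OrderConfirmed' appears in the question).

```sql
-- Hypothetical sample rows standing in for orderevent_order.
WITH sample_events AS (
  SELECT *
  FROM (VALUES
    ('A', TIMESTAMP '2020-01-01 10:00:00', 'OrderCreated'),
    ('A', TIMESTAMP '2020-01-01 10:05:00', 'OrderConfirmed'),
    ('B', TIMESTAMP '2020-01-01 11:00:00', 'OrderConfirmed'),
    ('B', TIMESTAMP '2020-01-01 11:30:00', 'OrderCancelled'),
    ('C', TIMESTAMP '2020-01-01 12:00:00', 'OrderConfirmed')
  ) AS t (unique_id, event_time, header_event_name)
),
ranked AS (
  SELECT unique_id,
         header_event_name,
         -- Number each order's events from newest (1) to oldest.
         ROW_NUMBER() OVER (PARTITION BY unique_id ORDER BY event_time DESC) AS rn
  FROM sample_events
)
SELECT unique_id
FROM ranked
WHERE rn = 1                                -- keep only the latest event per order
  AND header_event_name = 'OrderConfirmed'; -- and only if that event is a confirmation
```

With these rows the query should return orders A and C. Order B is excluded because its latest event is the cancellation, even though it was confirmed earlier.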

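Alternatively, reference the CTE in a FROM clause, a JOIN, or a subquery. The original query fails because latest_order_update is defined but never brought into scope, so Athena cannot resolve latest_order_update.latest_update. With a correlated subquery: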

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, 
     MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
WHERE orderevent_order.event_time IN (SELECT l.latest_update 
                                      FROM latest_order_update l
                                      WHERE orderevent_order.unique_id 
                                         = l.unique_id)           
  AND orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;

With a JOIN:

WITH latest_order_update AS (
  SELECT orderevent_order.unique_id, 
     MAX(orderevent_order.event_time) AS latest_update
  FROM orderevent_order
  GROUP BY orderevent_order.unique_id)
SELECT orderevent_order.unique_id
FROM orderevent_order
JOIN latest_order_update
  ON orderevent_order.event_time = latest_order_update.latest_update
 AND orderevent_order.unique_id = latest_order_update.unique_id
WHERE orderevent_order.header_event_name = 'OrderConfirmed'
LIMIT 10;
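
As a further option that is not part of the original answer, Presto's max_by aggregate can express the same idea without a CTE. This is only a sketch, assuming the same orderevent_order table and columns as in the question:

```sql
SELECT unique_id
FROM orderevent_order
GROUP BY unique_id
-- max_by(x, y) returns the value of x on the row where y is largest,
-- i.e. the event name of each order's most recent event.
HAVING max_by(header_event_name, event_time) = 'OrderConfirmed'
LIMIT 10;
```

One difference worth keeping in mind: if two events of the same order share the maximum event_time, the JOIN and subquery versions can return that order once per matching row, whereas max_by picks a single one of the tied rows.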
