BigQuery: Join the single most recent row from a second table

Problem description

I have two tables. One is a list of Orders, the other is a list of Events.

For each Order, I want to join the most recent Event whose clicked_at is before the created_at of the Order.

I have tried a number of ways to get this working, including several other answers on Stack Overflow, but I am struggling to return the correct data.

In my mind, the pseudo logic for the subquery looks like this:

-- Pseudocode: `order` stands for the current row of Orders.
SELECT campaign, keyword, user_id, clicked_at
FROM `Events`
WHERE user_id = order.user_id AND clicked_at < order.created_at
ORDER BY clicked_at DESC
LIMIT 1

Please see the sample data below:

# Orders

| order_id | user_id | created_at |
-----------------------------------
| 123      | abc     | 2020-07-04 |
| 456      | abc     | 2020-05-01 |


# Events

| campaign | keyword  | user_id | clicked_at |
----------------------------------------------
| facebook | shoes    | abc     | 2020-07-03 |
| google   | hair     | abc     | 2020-07-01 |

Desired result

# Orders with campaign attribution

| order_id | user_id | created_at | campaign | keyword  |
---------------------------------------------------------
| 123      | abc     | 2020-07-04 | facebook | shoes    |
| 456      | abc     | 2020-05-01 | null     | null     |

Thanks! Alex

Tags: mysql, google-bigquery

Solution


-- Sample data from the question, inlined as CTEs so the query runs on its own.
with orders as (
  select 123 as order_id, 'abc' as user_id, cast('2020-07-04' as date) as created_at union all
  select 456, 'abc', cast('2020-05-01' as date)
),
events as (
  select 'facebook' as campaign, 'shoes' as keyword, 'abc' as user_id, cast('2020-07-03' as date) as clicked_at union all
  select 'google', 'hair', 'abc', cast('2020-07-01' as date)
),
logic as (
  select
    orders.order_id, 
    orders.user_id, 
    orders.created_at, 
    events.clicked_at,
    events.campaign, 
    events.keyword, 
    -- Rank each order's candidate events, latest click first.
    row_number() over (partition by orders.order_id order by events.clicked_at desc) as rn
  from orders
  -- The left join keeps orders with no earlier click; their event columns stay null.
  left join events 
  on orders.user_id = events.user_id and events.clicked_at < orders.created_at
)
-- Keep only the latest qualifying event per order and drop the helper column.
select * except(rn)
from logic 
where rn = 1
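
The same "latest row per order" filter can probably be written more compactly with BigQuery's QUALIFY clause, which filters on a window function without a wrapping CTE. The following is only a sketch under assumptions, not part of the original answer: the aliases `o` and `e` are illustrative, the sample rows are copied from the question, and the `where true` line is a defensive addition that may not be required. For the sample data it should give facebook / shoes for order 123 and nulls for order 456.

```sql
-- Sample data from the question, inlined so the sketch is self-contained.
with orders as (
  select 123 as order_id, 'abc' as user_id, date '2020-07-04' as created_at union all
  select 456, 'abc', date '2020-05-01'
),
events as (
  select 'facebook' as campaign, 'shoes' as keyword, 'abc' as user_id, date '2020-07-03' as clicked_at union all
  select 'google', 'hair', 'abc', date '2020-07-01'
)
select
  o.order_id,
  o.user_id,
  o.created_at,
  e.campaign,
  e.keyword
from orders as o
-- Left join so that orders without any earlier click are still returned.
left join events as e
  on o.user_id = e.user_id and e.clicked_at < o.created_at
where true  -- assumption: harmless no-op filter placed before QUALIFY
-- Keep only the most recent qualifying event for each order.
qualify row_number() over (partition by o.order_id order by e.clicked_at desc) = 1
```

Both versions use a strict `<` comparison, so an event clicked on the same date as the order is not attributed to it. If two events for one order share the same clicked_at, row_number() picks one of them arbitrarily unless a tie-breaking column is added to the order by.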
