If the product was ordered earlier

Problem description

Table:

order_date      order_id     customer_id     product_id
2020-01-01      O1           C1              P1
2020-01-01      O1           C1              P2
2020-01-01      O1           C1              P3
2020-01-10      O2           C1              P2
2020-01-10      O2           C1              P3
2020-01-10      O2           C1              P4
2020-01-10      O2           C1              P5

I used the code below, but on a large data set (say, 10M rows) it times out. Is there a better way to improve the performance of this query?

select a.order_date, a.customer_id, a.order_id, a.product_id, case when b.product_id = a.product_id then 'Y' else 'N' end as prev_purchase_sts
from table a
left join table b on a.customer_id = b.customer_id and a.order_date > b.order_date

Tags: sql, amazon-redshift, window-functions, exists, self-join

Solution


If you want to check whether the customer has previously ordered the same product, use EXISTS:

SELECT a.*, 
       CASE 
         WHEN EXISTS (
                SELECT 1 
                FROM table b 
                WHERE a.customer_id = b.customer_id AND a.product_id = b.product_id AND a.order_date > b.order_date
              ) THEN 'Y' 
         ELSE 'N' 
       END AS prev_purchase_sts
FROM table a;
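
To see what the EXISTS version returns on the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for Redshift here, and the table name `orders` and the helper `flag_with_exists` are placeholders introduced for illustration (the question's `table` is a reserved word in most engines).

```python
import sqlite3

# Sample rows from the question: (order_date, order_id, customer_id, product_id).
ROWS = [
    ("2020-01-01", "O1", "C1", "P1"),
    ("2020-01-01", "O1", "C1", "P2"),
    ("2020-01-01", "O1", "C1", "P3"),
    ("2020-01-10", "O2", "C1", "P2"),
    ("2020-01-10", "O2", "C1", "P3"),
    ("2020-01-10", "O2", "C1", "P4"),
    ("2020-01-10", "O2", "C1", "P5"),
]

# Same logic as the EXISTS query above; `orders` is a placeholder table name.
EXISTS_QUERY = """
SELECT a.order_id, a.product_id,
       CASE
         WHEN EXISTS (
                SELECT 1
                FROM orders b
                WHERE a.customer_id = b.customer_id
                  AND a.product_id = b.product_id
                  AND a.order_date > b.order_date
              ) THEN 'Y'
         ELSE 'N'
       END AS prev_purchase_sts
FROM orders a
ORDER BY a.order_id, a.product_id
"""


def flag_with_exists(rows):
    """Load rows into an in-memory SQLite table and run the EXISTS query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_date TEXT, order_id TEXT, customer_id TEXT, product_id TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(EXISTS_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in flag_with_exists(ROWS):
        print(row)
```

On this data only P2 and P3 in order O2 come back as 'Y', because those are the only products that C1 also bought on an earlier date.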

Alternatively, use the LAG() window function:

SELECT *, 
       CASE 
         WHEN LAG(order_date) OVER (PARTITION BY customer_id, product_id ORDER BY order_date) IS NOT NULL THEN 'Y' 
         ELSE 'N' 
       END AS prev_purchase_sts
FROM table;

Alternatively, use the ROW_NUMBER() window function:

SELECT *, 
       CASE 
         WHEN ROW_NUMBER() OVER (PARTITION BY customer_id, product_id ORDER BY order_date) > 1 THEN 'Y' 
         ELSE 'N' 
       END AS prev_purchase_sts
FROM table;
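
As a quick sanity check, the sketch below runs the LAG() and ROW_NUMBER() variants side by side on the question's sample rows using Python's built-in sqlite3 module. It assumes a bundled SQLite of version 3.25 or later (needed for window functions); SQLite stands in for Redshift, and the table name `orders` and the helper `run_query` are placeholders introduced for illustration.

```python
import sqlite3

# Sample rows from the question: (order_date, order_id, customer_id, product_id).
ROWS = [
    ("2020-01-01", "O1", "C1", "P1"),
    ("2020-01-01", "O1", "C1", "P2"),
    ("2020-01-01", "O1", "C1", "P3"),
    ("2020-01-10", "O2", "C1", "P2"),
    ("2020-01-10", "O2", "C1", "P3"),
    ("2020-01-10", "O2", "C1", "P4"),
    ("2020-01-10", "O2", "C1", "P5"),
]

# 'Y' when an earlier row exists in the same (customer, product) partition.
LAG_QUERY = """
SELECT order_id, product_id,
       CASE
         WHEN LAG(order_date) OVER (
                PARTITION BY customer_id, product_id ORDER BY order_date
              ) IS NOT NULL THEN 'Y'
         ELSE 'N'
       END AS prev_purchase_sts
FROM orders
ORDER BY order_id, product_id
"""

# 'Y' for every row after the first one in the same partition.
ROW_NUMBER_QUERY = """
SELECT order_id, product_id,
       CASE
         WHEN ROW_NUMBER() OVER (
                PARTITION BY customer_id, product_id ORDER BY order_date
              ) > 1 THEN 'Y'
         ELSE 'N'
       END AS prev_purchase_sts
FROM orders
ORDER BY order_id, product_id
"""


def run_query(query, rows):
    """Load rows into an in-memory SQLite table and run the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_date TEXT, order_id TEXT, customer_id TEXT, product_id TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    lag_rows = run_query(LAG_QUERY, ROWS)
    rn_rows = run_query(ROW_NUMBER_QUERY, ROWS)
    print("variants agree:", lag_rows == rn_rows)
    for row in lag_rows:
        print(row)
```

Both window variants scan the table once per partition instead of joining it to itself. One difference from the EXISTS version is worth keeping in mind: if a customer has two rows for the same product on the same order_date, the window versions flag one of them 'Y', while EXISTS with a strict `>` comparison flags neither.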
