Table:
order_date   order_id  customer_id  product_id
2020-01-01   O1        C1           P1
2020-01-01   O1        C1           P2
2020-01-01   O1        C1           P3
2020-01-10   O2        C1           P2
2020-01-10   O2        C1           P3
2020-01-10   O2        C1           P4
2020-01-10   O2        C1           P5
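The code I am using is shown below, but on large data (for example, 10M rows) it times out. Is there a better way to improve the performance of this query?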
select a.order_date, a.customer_id, a.order_id, a.product_id,
       case when b.product_id = a.product_id then 'Y' else 'N' end as prev_purchase_sts
from table a
left join table b
  on a.customer_id = b.customer_id
 and a.order_date > b.order_date
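Your join matches only on customer and date, so every row is paired with all of that customer's earlier rows. That fan-out is what makes the query slow, and it also returns duplicate rows. If you want to check whether the customer has ordered the same product before, use EXISTS: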
SELECT a.*,
       CASE
           WHEN EXISTS (
               SELECT 1
               FROM table b
               WHERE a.customer_id = b.customer_id
                 AND a.product_id = b.product_id
                 AND a.order_date > b.order_date
           ) THEN 'Y'
           ELSE 'N'
       END AS prev_purchase_sts
FROM table a;
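The EXISTS subquery looks up rows by customer, product, and date, so a composite index on those three columns is usually what lets it scale. The sketch below assumes the table is called `orders` (a hypothetical stand-in for the `table` placeholder used in this thread), and the index name is also made up. Whether the optimizer actually uses the index depends on your database and its statistics, so check the execution plan.

```sql
-- Assumption: "orders" is a hypothetical name for the table in the question.
-- Equality columns first (customer_id, product_id), then the range column
-- (order_date), so the subquery can be answered by one index seek per row.
CREATE INDEX idx_orders_cust_prod_date
    ON orders (customer_id, product_id, order_date);

-- Same EXISTS query as above, written against the hypothetical table name.
SELECT a.*,
       CASE
           WHEN EXISTS (
               SELECT 1
               FROM orders b
               WHERE b.customer_id = a.customer_id
                 AND b.product_id  = a.product_id
                 AND b.order_date  < a.order_date
           ) THEN 'Y'
           ELSE 'N'
       END AS prev_purchase_sts
FROM orders a;
```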
Or, with the LAG() window function:
SELECT *,
       CASE
           WHEN LAG(order_date) OVER (PARTITION BY customer_id, product_id ORDER BY order_date) IS NOT NULL THEN 'Y'
           ELSE 'N'
       END AS prev_purchase_sts
FROM table;
Or, with the ROW_NUMBER() window function:
SELECT *,
       CASE
           WHEN ROW_NUMBER() OVER (PARTITION BY customer_id, product_id ORDER BY order_date) > 1 THEN 'Y'
           ELSE 'N'
       END AS prev_purchase_sts
FROM table;
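One caveat: the LAG() and ROW_NUMBER() versions can disagree with the EXISTS version when the same customer has the same product on two rows with the same order_date. EXISTS requires a strictly earlier date, while the window versions flag the second of the tied rows as 'Y'. If that matters for your data, a possible variant is to compare each row's date with the earliest date in its partition. This is a sketch, not a tested drop-in: the `orders` table name and the column types are assumptions, and the date literal syntax may need adjusting for your database (Oracle, for instance, wants DATE '2020-01-01').

```sql
-- Assumption: hypothetical table "orders" holding the sample rows from the question.
CREATE TABLE orders (
    order_date  DATE,
    order_id    VARCHAR(10),
    customer_id VARCHAR(10),
    product_id  VARCHAR(10)
);

INSERT INTO orders (order_date, order_id, customer_id, product_id) VALUES
    ('2020-01-01', 'O1', 'C1', 'P1'),
    ('2020-01-01', 'O1', 'C1', 'P2'),
    ('2020-01-01', 'O1', 'C1', 'P3'),
    ('2020-01-10', 'O2', 'C1', 'P2'),
    ('2020-01-10', 'O2', 'C1', 'P3'),
    ('2020-01-10', 'O2', 'C1', 'P4'),
    ('2020-01-10', 'O2', 'C1', 'P5');

-- A row has a previous purchase only if the earliest date for the same
-- customer and product is strictly before this row's date.
-- Tied dates therefore stay 'N', matching the EXISTS version.
SELECT o.*,
       CASE
           WHEN MIN(o.order_date) OVER (PARTITION BY o.customer_id, o.product_id) < o.order_date THEN 'Y'
           ELSE 'N'
       END AS prev_purchase_sts
FROM orders o;
```

On the sample rows, only the P2 and P3 lines of order O2 come back as 'Y'. Because the window has no ORDER BY, the database only needs the per-partition minimum, though it still has to group the whole table by customer and product.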