Include only state transitions in a SQL query



I have a table of customers and their purchases, as follows:

customer     shop       time
----------------------------
1          5        13.30   
1          5        14.33
1          10       22.17
2          3        12.15
2          1        13.30
2          1        15.55
2          3        17.29

Since I am interested in the changes of shop, I need the following output:

customer     shop       time
----------------------------
1          5        13.30   
1          10       22.17
2          3        12.15
2          1        13.30
2          3        17.29

I tried using

ROW_NUMBER() OVER (PARTITION BY customer, shop ORDER BY time ASC) AS counter

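and then keeping only the rows where counter = 1. However, this breaks when a customer returns to the same shop later on, as with customer = 2 and shop = 3 in my example: the second visit gets counter = 2 and is dropped.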

I came up with this:

WITH a AS
(
    SELECT
        customer, shop, time,
        ROW_NUMBER() OVER (PARTITION BY customer ORDER BY time ASC) AS counter
    FROM
        db
)
-- rows whose shop differs from the same customer's previous row
SELECT a1.*
FROM a a1
JOIN a AS a2 ON (a1.customer = a2.customer AND a2.counter + 1 = a1.counter AND a2.shop <> a1.shop)
UNION
-- plus the first row of every customer
SELECT a.*
FROM a
WHERE counter = 1

However, this is very inefficient, and running it on AWS, where my data lives, fails with an error telling me

Query exhausted resources at this scale factor

Is there a way to make the query more efficient?

This is a gaps-and-islands problem. But the simplest solution uses lag():

select customer, shop, time
from (select t.*, lag(shop) over (partition by customer order by time) as prev_shop
from t
) t
where prev_shop is null or prev_shop <> shop;
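
As a quick sanity check of the lag() approach, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the same query. The table name t, the helper name shop_changes, storing time as text, and the use of SQLite in place of the asker's AWS engine are all assumptions made for illustration. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Sample rows from the question: (customer, shop, time).
# time is kept as text here; that is an assumption, the real column type is unknown.
ROWS = [
    (1, 5, "13.30"),
    (1, 5, "14.33"),
    (1, 10, "22.17"),
    (2, 3, "12.15"),
    (2, 1, "13.30"),
    (2, 1, "15.55"),
    (2, 3, "17.29"),
]

# Same logic as the answer's query, with an explicit ORDER BY for stable output.
QUERY = """
SELECT customer, shop, time
FROM (
    SELECT t.*,
           LAG(shop) OVER (PARTITION BY customer ORDER BY time) AS prev_shop
    FROM t
) AS s
WHERE prev_shop IS NULL OR prev_shop <> shop
ORDER BY customer, time
"""


def shop_changes(rows):
    """Return each customer's first row plus every row whose shop differs from the previous one."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (customer INTEGER, shop INTEGER, time TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in shop_changes(ROWS):
        print(row)
```

On the sample data this returns the five rows of the desired output, including the later return of customer 2 to shop 3. The query needs no self-join, only a window function over the table.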
