In PostgreSQL, how do I get all users who had at least one "event" in each of three consecutive weeks?
my_events
id user_id created_at
------------------------------------------
1 1234 2015-05-05 22:49:18.345787
2 4567 2015-05-05 22:49:30.343386
3 1234 2015-05-05 22:50:16.465029
...
What I have tried:
Step 1 (get all weeks in which a user had an event):
SELECT
user_id,
date_part('year', created_at) AS year,
date_part('week', created_at) AS week,
COUNT(1) AS events
FROM my_events
GROUP BY user_id, date_part('year', created_at), date_part('week', created_at)
Step 2 (count the weeks in which each user had events):
SELECT
user_id,
COUNT(1) AS weeks_with_events
FROM (
-- Get all weeks where a user had an event.
SELECT
user_id,
date_part('year', created_at) AS year,
date_part('week', created_at) AS week,
COUNT(1) AS events
FROM my_events
GROUP BY user_id, date_part('year', created_at), date_part('week', created_at)
) a
GROUP BY user_id
Unfortunately, this only gives me the number of weeks that contain events; it does not check whether those weeks are consecutive.
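WITH x AS (
    -- One row per user per (year, week); DISTINCT collapses several events in the same week.
    SELECT DISTINCT
        user_id,
        date_part('year', created_at) AS year,
        date_part('week', created_at) AS week
    FROM my_events
),
y AS (
    SELECT user_id, year, week AS curr,
        -- A missing neighbour stays NULL, so the comparison below fails for that row.
        -- (COALESCE(..., 0) would wrongly treat week 1 as following a "week 0".)
        lead(week) OVER (PARTITION BY user_id, year ORDER BY week) AS nxt,
        lag(week) OVER (PARTITION BY user_id, year ORDER BY week) AS prev
    FROM x
)
SELECT DISTINCT user_id
FROM y
WHERE nxt - curr = 1 AND curr - prev = 1;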
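You can try this, since you need users with at least one event in each of three consecutive weeks. The premise is that if a user has at least one event in each of three consecutive weeks, there will be at least one row where nxt-curr and curr-prev are both 1. Note that the window is partitioned by year, so a run of weeks that spans a year boundary is not detected.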
I assume you are starting from this query:
SELECT user_id, date_part('year', created_at) AS yyyy,
       date_part('week', created_at) AS ww, COUNT(*) AS events
FROM my_events
GROUP BY user_id, date_part('year', created_at), date_part('week', created_at);
While not the most general solution, the simplest is probably to use lead()/lag():
WITH t AS (
    SELECT user_id, date_part('year', created_at) AS yyyy,
           date_part('week', created_at) AS ww, COUNT(*) AS events
    FROM my_events
    GROUP BY user_id, date_part('year', created_at), date_part('week', created_at)
)
SELECT user_id, ww
FROM (SELECT t.*,
             -- ww1 and ww2 are the week numbers one and two rows ahead for the same user.
             lead(ww, 1) OVER (PARTITION BY user_id ORDER BY yyyy, ww) AS ww1,
             lead(ww, 2) OVER (PARTITION BY user_id ORDER BY yyyy, ww) AS ww2
      FROM t
     ) t
WHERE ww1 = ww + 1 AND ww2 = ww + 2;
This will run into problems at the end of the year, because the week-of-year numbering starts over.
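To make the year-end problem concrete, here is a small sketch using three made-up dates (not rows from my_events) that fall in three consecutive weeks around New Year 2016. date_part('week', ...) returns the ISO 8601 week number while date_part('year', ...) returns the calendar year, so the two do not line up in early January:

```sql
-- Hypothetical dates chosen to straddle New Year 2016.
SELECT d,
       date_part('year', d)    AS yyyy,      -- calendar year
       date_part('week', d)    AS ww,        -- ISO 8601 week number
       date_part('isoyear', d) AS iso_yyyy   -- ISO year the week belongs to
FROM (VALUES (DATE '2015-12-23'),
             (DATE '2016-01-02'),
             (DATE '2016-01-05')) AS v(d);
-- 2015-12-23 -> yyyy 2015, ww 52, iso_yyyy 2015
-- 2016-01-02 -> yyyy 2016, ww 53, iso_yyyy 2015
-- 2016-01-05 -> yyyy 2016, ww 1,  iso_yyyy 2016
```

Ordered by (yyyy, ww), these rows come out as (2015, 52), (2016, 1), (2016, 53), so a lead()-based comparison of adjacent week numbers does not see them as consecutive.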
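Edit:
To handle the year boundary, you would want to do something like yyyy*52 + ww. Alas, that does not quite work for weeks: date_part('week', ...) returns the ISO 8601 week number, some years have 53 weeks, and the first days of January can still belong to the last week of the previous year.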
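Instead, pick a fixed starting date that falls on the right day of the week. Then count the number of days since that date, divide by 7, and truncate. That gives you an absolute week number that is independent of the year. From there, the rest of the query is basically the same.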
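A minimal sketch of that idea follows. The anchor date 1970-01-05 is my own choice (it is a Monday, so weeks run Monday to Sunday like ISO weeks), and the names weeks, runs, week_no and week_no2 are made up for illustration. It assumes created_at is a timestamp and that no event predates the anchor, since integer division truncates toward zero for negative values.

```sql
WITH weeks AS (
    -- One row per user per absolute week.
    -- date - date yields an integer number of days; integer / 7 truncates.
    SELECT DISTINCT
           user_id,
           (created_at::date - DATE '1970-01-05') / 7 AS week_no
    FROM my_events
),
runs AS (
    SELECT user_id,
           week_no,
           -- Week number two rows ahead for the same user.
           lead(week_no, 2) OVER (PARTITION BY user_id ORDER BY week_no) AS week_no2
    FROM weeks
)
SELECT DISTINCT user_id
FROM runs
WHERE week_no2 = week_no + 2;
```

Because weeks holds distinct week numbers in increasing order, a value of week_no + 2 two rows ahead implies that the row in between is week_no + 1, so a single lead(..., 2) is enough to detect three consecutive weeks.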