Counting consecutive days in Postgres



I'm trying to count the number of consecutive days across two tables with the following structure:

| id | email | timestamp |
| -------- | -------------- | -------------- |
| 1    | hello@example.com  | 2021-10-22 00:35:22 | 
| 2    | hello2@example.com  | 2021-10-21 21:17:41 |
| 1    | hello@example.com  | 2021-10-19 00:35:22 | 
| 1    | hello@example.com  | 2021-10-18 00:35:22 | 
| 1    | hello@example.com  | 2021-10-17 00:35:22 | 

I want to count the number of consecutive days of activity. For the data above, the result would be:

| id | email | length |
| -------- | -------------- | -- |
| 1    | hello@example.com  | 1 | 
| 2    | hello2@example.com  | 1 |
| 1    | hello@example.com  | 3 | 

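This is made harder by the fact that I need to combine the two tables with a UNION (or something similar) and then run the grouping. I tried to build on this query (Finding the length of a series in postgres), but I can't get it to group by consecutive days.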

    select max(id) as max_id, email, count(*) as length
    from (
        select *, row_number() over wa - row_number() over wp as grp
        from began_playing_video
        window
            wp as (partition by email order by id desc),
            wa as (order by id desc)
    ) s
    group by email, grp
    order by 1 desc

Any ideas on how I could do this in Postgres?

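First create an aggregate function that counts adjacent dates in a list sorted in ascending order. The jsonb data type is used because it allows values of different types to be mixed in the same array: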

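    -- State transition function: x is the running state, y the seed ('[]'), d the next date.
    -- State layout: [last_date, run_length_1, run_length_2, ...]
    CREATE OR REPLACE FUNCTION count_date(x jsonb, y jsonb, d date)
    RETURNS jsonb LANGUAGE sql AS
    $$
    SELECT CASE
        WHEN d IS NULL
            THEN COALESCE(x, y)
        ELSE
            to_jsonb(d :: text)
            || CASE
                -- first date seen: start a run of length 1
                WHEN COALESCE(x, y) = '[]' :: jsonb
                    THEN '[1]' :: jsonb
                -- next calendar day: increment the last run length
                WHEN COALESCE(x->>0, y->>0) :: date + 1 = d :: date
                    THEN jsonb_set(COALESCE(x-0, y-0), '{-1}', to_jsonb(COALESCE(x->>-1, y->>-1) :: integer + 1))
                -- gap: append a new run of length 1
                ELSE COALESCE(x-0, y-0) || to_jsonb(1)
            END
    END ;
    $$ ;

    DROP AGGREGATE IF EXISTS count_date(jsonb, date) ;
    CREATE AGGREGATE count_date(jsonb, date)
    (
        sfunc = count_date
      , stype = jsonb
    ) ;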
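
To make the state layout concrete, the transition function can be called by hand. This is only an illustrative trace: it assumes the three-argument `count_date` function above has already been created and that `DateStyle` is left at its default ISO setting. The dates are taken from the question's sample data.

```sql
-- First row: the state is NULL, so the '[]' seed is used and a run of length 1 starts.
SELECT count_date(NULL, '[]', DATE '2021-10-17');
-- ["2021-10-17", 1]

-- Next calendar day: the last counter is incremented.
SELECT count_date('["2021-10-17", 1]', '[]', DATE '2021-10-18');
-- ["2021-10-18", 2]

-- A gap of several days: a new counter starting at 1 is appended.
SELECT count_date('["2021-10-18", 2]', '[]', DATE '2021-10-22');
-- ["2021-10-22", 2, 1]
```

The first array element is always the most recent date; everything after it is the list of run lengths in chronological order.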

Then run the count_date aggregate over the table, grouped by id, and expand the resulting list of run lengths into rows:

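    WITH list AS (
        SELECT id, email,
               -- explicit cast: the aggregate takes a date, and timestamp-to-date is not an implicit cast
               count_date('[]', "timestamp" :: date ORDER BY "timestamp" :: timestamp) AS count_list
        FROM your_table
        GROUP BY id, email
    )
    -- count_list - 0 drops the leading date and keeps only the run lengths
    SELECT id, email, jsonb_array_elements(count_list - 0) AS length
    FROM list ;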
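
As an alternative to the custom aggregate, the same result can probably be obtained with plain window functions, using the common "gaps and islands" trick: within one user, subtracting `row_number()` from the date yields a constant for every run of consecutive days. The sketch below is not the method from the answer above. The table `other_events` is a hypothetical stand-in for the second table (the question does not name it), and the sample rows are split across the two tables arbitrarily.

```sql
WITH began_playing_video (id, email, "timestamp") AS (
    VALUES (1, 'hello@example.com',  TIMESTAMP '2021-10-22 00:35:22'),
           (2, 'hello2@example.com', TIMESTAMP '2021-10-21 21:17:41'),
           (1, 'hello@example.com',  TIMESTAMP '2021-10-19 00:35:22')
),
other_events (id, email, "timestamp") AS (   -- hypothetical second table
    VALUES (1, 'hello@example.com', TIMESTAMP '2021-10-18 00:35:22'),
           (1, 'hello@example.com', TIMESTAMP '2021-10-17 00:35:22')
),
days AS (
    -- UNION (not UNION ALL) removes duplicates, so each user/day pair appears once
    SELECT id, email, "timestamp"::date AS activity_day FROM began_playing_video
    UNION
    SELECT id, email, "timestamp"::date AS activity_day FROM other_events
),
grouped AS (
    SELECT id, email, activity_day,
           -- date minus row number is identical for all days of one consecutive run
           activity_day - (row_number() OVER (PARTITION BY id, email
                                              ORDER BY activity_day))::int AS grp
    FROM days
)
SELECT id, email, count(*) AS length
FROM grouped
GROUP BY id, email, grp
ORDER BY max(activity_day) DESC;
-- id | email              | length
--  1 | hello@example.com  | 1
--  2 | hello2@example.com | 1
--  1 | hello@example.com  | 3
```

Deduplicating to one row per user and day before numbering matters here: several events on the same day would otherwise break a run into pieces.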
