I have a table with data as follows:
Log table:
UserId | LoginDate
---|---
1 | 2022-01-03
1 | 2022-01-04
1 | 2022-01-10
1 | 2022-01-11
1 | 2022-01-12
1 | 2022-01-23
1 | 2022-01-25
1 | 2022-01-26
1 | 2022-01-27
1 | 2022-01-28
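This problem is a variant of gaps and islands, where an island is a cluster of records with consecutive dates for a given user. Here is one approach using analytic (window) functions: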
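WITH cte AS (
    -- Flag each row that starts a new island: more than one day after the user's previous login.
    SELECT *,
           CASE WHEN julianday(LoginDate) -
                     julianday(LAG(LoginDate) OVER (PARTITION BY UserID ORDER BY LoginDate)) > 1
                THEN 1 ELSE 0 END AS counter
    FROM yourTable
),
cte2 AS (
    -- A running sum of the flags gives every island its own group number.
    SELECT *, SUM(counter) OVER (PARTITION BY UserID ORDER BY LoginDate) AS grp
    FROM cte
)
-- Keep rows before the target date that share a group with the day before it.
SELECT UserID, LoginDate
FROM cte2 t1
WHERE LoginDate < '2022-01-29' AND
      grp = (SELECT t2.grp FROM cte2 t2
             WHERE t2.UserID = t1.UserID AND t2.LoginDate = '2022-01-28');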
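The two CTEs generate a pseudo group number for each cluster of consecutive dates per user. The final query returns all records earlier than the target date (2022-01-29) whose group value matches that of the previous day (2022-01-28). Hence, if the user has no record on the day immediately before the target date, the query returns an empty set.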
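To make the grouping concrete, here is a minimal sketch that runs the two window steps through Python's built-in `sqlite3` module and prints the `counter` and `grp` values for each row. The sample dates, the `ROWS` constant, and the `island_groups` helper are assumptions made for illustration, and the sketch assumes the bundled SQLite is version 3.25 or later (required for window functions).

```python
import sqlite3

# Assumed sample data for user 1, chosen to resemble the question's table.
ROWS = [(1, d) for d in (
    "2022-01-03", "2022-01-04", "2022-01-10", "2022-01-11", "2022-01-12",
    "2022-01-23", "2022-01-25", "2022-01-26", "2022-01-27", "2022-01-28",
)]

# Same logic as the two CTEs above, but selecting the intermediate columns.
GROUPS_SQL = """
WITH cte AS (
    SELECT *,
           CASE WHEN julianday(LoginDate) -
                     julianday(LAG(LoginDate) OVER (PARTITION BY UserID ORDER BY LoginDate)) > 1
                THEN 1 ELSE 0 END AS counter
    FROM yourTable
)
SELECT LoginDate, counter,
       SUM(counter) OVER (PARTITION BY UserID ORDER BY LoginDate) AS grp
FROM cte
ORDER BY LoginDate
"""


def island_groups():
    """Return (LoginDate, counter, grp) tuples for the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourTable (UserID INTEGER, LoginDate TEXT)")
    conn.executemany("INSERT INTO yourTable VALUES (?, ?)", ROWS)
    rows = conn.execute(GROUPS_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in island_groups():
        print(row)
```

With this sample data the four dates from 2022-01-25 through 2022-01-28 all end up with `grp` 3, while 2022-01-23 sits alone in `grp` 2, which is why the final query with a target of 2022-01-29 returns only those four rows.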
Using a recursive CTE:
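WITH cte(UserId, LoginDate) AS (
    -- Anchor row: the requested user and target date.
    SELECT :var_user_id, :var_date
    UNION ALL
    -- Step back one day for as long as the user has a login on that earlier day.
    SELECT c.UserId, date(c.LoginDate, '-1 day')
    FROM cte c
    WHERE EXISTS (
        SELECT 1
        FROM tablename t
        WHERE t.UserId = c.UserId AND t.LoginDate = date(c.LoginDate, '-1 day')
    )
)
-- Drop the anchor row itself, which is always the latest date in the CTE.
SELECT *
FROM cte
WHERE LoginDate < (SELECT MAX(LoginDate) FROM cte);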
Replace :var_user_id and :var_date with the user ID and the date you want.
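One way to supply the two placeholders is through named parameter binding, sketched here with Python's built-in `sqlite3` module. The sample dates and the helper names `make_demo_db` and `consecutive_before` are assumptions for illustration only; the query is the recursive CTE from above, selecting just `LoginDate` and sorted for readability.

```python
import sqlite3

RECURSIVE_SQL = """
WITH cte(UserId, LoginDate) AS (
    SELECT :var_user_id, :var_date
    UNION ALL
    SELECT c.UserId, date(c.LoginDate, '-1 day')
    FROM cte c
    WHERE EXISTS (SELECT 1 FROM tablename t
                  WHERE t.UserId = c.UserId
                    AND t.LoginDate = date(c.LoginDate, '-1 day'))
)
SELECT LoginDate
FROM cte
WHERE LoginDate < (SELECT MAX(LoginDate) FROM cte)
ORDER BY LoginDate
"""


def make_demo_db():
    """Build an in-memory table with assumed sample logins for user 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (UserId INTEGER, LoginDate TEXT)")
    dates = ("2022-01-03", "2022-01-04", "2022-01-10", "2022-01-11", "2022-01-12",
             "2022-01-23", "2022-01-25", "2022-01-26", "2022-01-27", "2022-01-28")
    conn.executemany("INSERT INTO tablename VALUES (?, ?)", [(1, d) for d in dates])
    return conn


def consecutive_before(conn, user_id, target_date):
    """Return the run of consecutive login dates ending the day before target_date."""
    # Dictionary keys match the :var_user_id and :var_date placeholders in the SQL.
    params = {"var_user_id": user_id, "var_date": target_date}
    return [row[0] for row in conn.execute(RECURSIVE_SQL, params)]


if __name__ == "__main__":
    conn = make_demo_db()
    print(consecutive_before(conn, 1, "2022-01-29"))
    print(consecutive_before(conn, 1, "2022-01-20"))
```

The date should be passed as a `YYYY-MM-DD` string so that it compares correctly with the output of `date()`. With the sample data, a target of 2022-01-29 yields the four dates 2022-01-25 through 2022-01-28, and a target of 2022-01-20 yields an empty list because there is no login on 2022-01-19.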