sqlite计算连续匹配行的数量,同时排除它们



更新

即使有一个公认的答案,我仍然对其他建议持开放态度。我需要这一点来处理sqlite,它至少可以追溯到3.19.4版本(实际上是Android 8(,并且在sqlite的3.28版本之前,最高效的可接受答案形式(使用窗口函数(是不可用的。当被查询的表包含数百行时,回退会导致设备停滞,然后崩溃,所以我不能依赖这些。


原始问题

假设我有一个名为messages的sqlite表,其中包含以下列:

| id | type | text | time |
---------------------------

id是主键并且是唯一的。想象一下,我有5行,顺序如下(为了清晰起见,将它们表示为JSON数组(:

[
{
id: 'first',
type: 'random',
text: 'hey there',
time: '2022-02-15T01:47:25.581'
},
{
id: 'second',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:48:25.581'
}
{
id: 'third',
type: 'new_item',
text: 'new socks',
time: '2022-02-15T01:49:25.581'
}
{
id: 'fourth',
type: 'random',
text: 'what time is it',
time: '2022-02-15T01:50:25.581'
},
{
id: 'fifth',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:51:25.581'
}

我想查询这些消息,以便将类型为new_item的连续行表示为一行,以及最终输出中每个唯一new_item的连续行数。具体地说,我希望输出能给我以下捕获的信息(不必是相同的模式,这只是我想要的一个例子(:

[
{
id: 'first',
type: 'random',
text: 'hey there',
time: '2022-02-15T01:47:25.581'
},
{
id: 'second',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:48:25.581',
numConsecutiveItems: 2
}
{
id: 'fourth',
type: 'random',
text: 'what time is it',
time: '2022-02-15T01:50:25.581'
},
{
id: 'fifth',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:51:25.581',
numConsecutiveItems: 1
}

请注意,id为third的消息不在最终输出中,因为它的类型为new_item,并且连续地跟在另一个类型为new_item的消息之后,而id为second的消息的numConsecutiveItems2也是出于同样的原因。更重要的是,id为fifth的消息之所以存在,是因为它没有立即跟随另一个new_item消息,并且出于同样的原因,它的值numConsecutiveItems1。我可以通过一个查询并按time列排序来实现这一点吗?这将是我的强烈偏好,但如果不是,那么理想情况下不超过2个查询。非常感谢。

使用窗口函数创建连续类型的组,并计算每组中'new_item'的数量:

WITH cte AS (
SELECT *, 
COUNT(*) OVER (PARTITION BY grp) count,
ROW_NUMBER() OVER (PARTITION BY grp ORDER BY time) rn 
FROM (       
SELECT *, SUM(flag) OVER (ORDER BY time) grp
FROM (
SELECT *, (type <> LAG(type, 1, '') OVER (ORDER BY time)) flag
FROM tablename
)
)  
)
SELECT id, type, text, time,
CASE WHEN type = 'new_item' THEN count END numConsecutiveItems
FROM cte
WHERE numConsecutiveItems IS NULL OR rn = 1
ORDER BY time;

对于不支持窗口函数的SQLite版本,请使用聚合和相关子查询来模拟窗口函数:

WITH 
prev_types AS (
SELECT t1.*, t1.type <> COALESCE(t2.type, '') flag, MAX(t2.time) max_time 
FROM tablename t1 LEFT JOIN tablename t2
ON t2.time < t1.time
GROUP BY t1.id
),
sum_flags AS (
SELECT pt1.*, SUM(pt2.flag) grp
FROM prev_types pt1 INNER JOIN prev_types pt2
ON pt2.time <= pt1.time
GROUP BY pt1.id
),
cte AS (
SELECT sf1.*, 
(SELECT COUNT(*) FROM sum_flags sf2 WHERE sf2.grp = sf1.grp) count,
(SELECT COUNT(*) FROM sum_flags sf2 WHERE sf2.grp = sf1.grp AND sf2.time <= sf1.time) rn
FROM sum_flags sf1
)
SELECT id, type, text, time,
CASE WHEN type = 'new_item' THEN count END numConsecutiveItems
FROM cte
WHERE numConsecutiveItems IS NULL OR rn = 1
ORDER BY time;

请参阅演示

相关内容

  • 没有找到相关文章

最新更新