更新
即使有一个公认的答案,我仍然对其他建议持开放态度。我需要这一点来处理sqlite,它至少可以追溯到3.19.4
版本(实际上是Android 8(,并且在sqlite的3.28
版本之前,最高效的可接受答案形式(使用窗口函数(是不可用的。当被查询的表包含数百行时,回退会导致设备停滞,然后崩溃,所以我不能依赖这些。
原始问题
假设我有一个名为messages
的sqlite表,其中包含以下列:
| id | type | text | time |
---------------------------
id
是主键并且是唯一的。想象一下,我有5行,顺序如下(为了清晰起见,将它们表示为JSON数组(:
[
{
id: 'first',
type: 'random',
text: 'hey there',
time: '2022-02-15T01:47:25.581'
},
{
id: 'second',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:48:25.581'
}
{
id: 'third',
type: 'new_item',
text: 'new socks',
time: '2022-02-15T01:49:25.581'
}
{
id: 'fourth',
type: 'random',
text: 'what time is it',
time: '2022-02-15T01:50:25.581'
},
{
id: 'fifth',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:51:25.581'
}
我想查询这些消息,以便将类型为new_item
的连续行表示为一行,以及最终输出中每个唯一new_item
的连续行数。具体地说,我希望输出能给我以下捕获的信息(不必是相同的模式,这只是我想要的一个例子(:
[
{
id: 'first',
type: 'random',
text: 'hey there',
time: '2022-02-15T01:47:25.581'
},
{
id: 'second',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:48:25.581',
numConsecutiveItems: 2
}
{
id: 'fourth',
type: 'random',
text: 'what time is it',
time: '2022-02-15T01:50:25.581'
},
{
id: 'fifth',
type: 'new_item',
text: 'new shoe',
time: '2022-02-15T01:51:25.581',
numConsecutiveItems: 1
}
请注意,id为third
的消息不在最终输出中,因为它的类型为new_item
,并且连续地跟在另一个类型为new_item
的消息之后,而id为second
的消息的numConsecutiveItems
为2
也是出于同样的原因。更重要的是,id为fifth
的消息之所以存在,是因为它没有立即跟随另一个new_item
消息,并且出于同样的原因,它的值numConsecutiveItems
是1
。我可以通过一个查询并按time
列排序来实现这一点吗?这将是我的强烈偏好,但如果不是,那么理想情况下不超过2个查询。非常感谢。
使用窗口函数创建连续类型的组,并计算每组中'new_item'
的数量:
WITH cte AS (
SELECT *,
COUNT(*) OVER (PARTITION BY grp) count,
ROW_NUMBER() OVER (PARTITION BY grp ORDER BY time) rn
FROM (
SELECT *, SUM(flag) OVER (ORDER BY time) grp
FROM (
SELECT *, (type <> LAG(type, 1, '') OVER (ORDER BY time)) flag
FROM tablename
)
)
)
SELECT id, type, text, time,
CASE WHEN type = 'new_item' THEN count END numConsecutiveItems
FROM cte
WHERE numConsecutiveItems IS NULL OR rn = 1
ORDER BY time;
对于不支持窗口函数的SQLite版本,请使用聚合和相关子查询来模拟窗口函数:
WITH
prev_types AS (
SELECT t1.*, t1.type <> COALESCE(t2.type, '') flag, MAX(t2.time) max_time
FROM tablename t1 LEFT JOIN tablename t2
ON t2.time < t1.time
GROUP BY t1.id
),
sum_flags AS (
SELECT pt1.*, SUM(pt2.flag) grp
FROM prev_types pt1 INNER JOIN prev_types pt2
ON pt2.time <= pt1.time
GROUP BY pt1.id
),
cte AS (
SELECT sf1.*,
(SELECT COUNT(*) FROM sum_flags sf2 WHERE sf2.grp = sf1.grp) count,
(SELECT COUNT(*) FROM sum_flags sf2 WHERE sf2.grp = sf1.grp AND sf2.time <= sf1.time) rn
FROM sum_flags sf1
)
SELECT id, type, text, time,
CASE WHEN type = 'new_item' THEN count END numConsecutiveItems
FROM cte
WHERE numConsecutiveItems IS NULL OR rn = 1
ORDER BY time;
请参阅演示