I have been struggling for quite a while to fetch a particular set of rows from PostgreSQL. I am running PostgreSQL 11 in Docker, and the table events contains:
Column | type | Modifiers
-----------|-----------|-------------
id | text | primary key
client_id | text | foreign key
created_at | timestamp | not null
context | text[] | not null
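The goal is to fetch the most recently inserted events for a subset of client_id values and a subset of contexts. Performance matters a lot, so I want to do this in SQL only.
So far I have explored a solution using LATERAL, but it does not work well: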
SELECT e2.*
FROM (
SELECT events.client_id, events.context
FROM events
WHERE
events.client_id IN (?) AND
events.context && ?
GROUP BY (events.client_id, events.context)
) e1 LEFT JOIN LATERAL (
SELECT *
FROM events
WHERE
events.client_id = e1.client_id AND
events.context = e1.context
ORDER BY events.created_at DESC LIMIT 1
) e2 ON true;
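The main problem is that when one of the requested context values appears in two different context arrays, this query returns both events, but I only want one. I want N rows per client_id, where N is between 0 (no events) and the length of the context array passed as a parameter.
Example rows: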
created_at | client_id | context
----------------| -------------|------------
5pm:15 | accb | ['home']
5pm:15 | baac | ['home']
5pm:20 | accb | ['home','shopping_cart']
5pm:25 | accb | ['shopping_cart','payment']
5pm:30 | accb | ['disconnect']
5pm:30 | baac | ['home','shopping_cart','payment']
5pm:35 | baac | ['disconnect']
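To reproduce this locally, the sample rows above could be loaded with something like the following sketch. The id values and the calendar date are invented for illustration, and the foreign key on client_id is left out because the referenced table is not shown.

```sql
-- Sketch of the events table; the foreign key on client_id is omitted
-- because the referenced table is not part of the question.
CREATE TABLE events (
    id         text PRIMARY KEY,
    client_id  text,
    created_at timestamp NOT NULL,
    context    text[] NOT NULL
);

-- The ids (e1..e7) and the date 2020-01-01 are assumptions for illustration.
INSERT INTO events (id, client_id, created_at, context) VALUES
    ('e1', 'accb', '2020-01-01 17:15', ARRAY['home']),
    ('e2', 'baac', '2020-01-01 17:15', ARRAY['home']),
    ('e3', 'accb', '2020-01-01 17:20', ARRAY['home', 'shopping_cart']),
    ('e4', 'accb', '2020-01-01 17:25', ARRAY['shopping_cart', 'payment']),
    ('e5', 'accb', '2020-01-01 17:30', ARRAY['disconnect']),
    ('e6', 'baac', '2020-01-01 17:30', ARRAY['home', 'shopping_cart', 'payment']),
    ('e7', 'baac', '2020-01-01 17:35', ARRAY['disconnect']);
```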
With the parameter ['accb','baac'] for the client IDs and the parameter ['home','shopping_cart'] for the contexts, I would like to get:
created_at | client_id | context
----------------| -------------|------------
5pm:20 | accb | ['home','shopping_cart']
5pm:25 | accb | ['shopping_cart','payment']
5pm:30 | baac | ['home','shopping_cart','payment']
Does anyone have a solution?
If you need one row per client_id, I would suggest using distinct on:
select distinct on (e.client_id) e.*
from events e
where e.client_id in (?) and
e.context && ?
order by e.client_id, e.created_at desc;
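Note that distinct on (e.client_id) returns at most one row per client. If what is wanted is the expected output from the question, that is, the latest event per client and per requested context value with duplicates removed, one possible sketch is to unnest both parameters and look up the latest matching event with LATERAL. The literal arrays stand in for the ? placeholders, and the aliases c, x, ev and e are just names chosen here.

```sql
-- For every (client_id, requested context value) pair, pick the latest event
-- whose context array contains that value, then drop duplicate events.
SELECT DISTINCT e.*
FROM unnest(ARRAY['accb', 'baac']) AS c(client_id)
CROSS JOIN unnest(ARRAY['home', 'shopping_cart']) AS x(ctx)
CROSS JOIN LATERAL (
    SELECT ev.*
    FROM events ev
    WHERE ev.client_id = c.client_id
      AND ev.context @> ARRAY[x.ctx]   -- array contains this one value
    ORDER BY ev.created_at DESC
    LIMIT 1
) e
ORDER BY e.client_id, e.created_at;
```

CROSS JOIN LATERAL drops pairs with no matching event, so a client ends up with anywhere from 0 rows up to one row per requested context value. On the sample rows from the question this should give the two accb events and the single baac event listed as the desired result.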
A colleague found a solution:
WITH ranked_events AS (
SELECT
events.*,
ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY created_at DESC) AS pos
FROM
events
WHERE
context @> ? AND
client_id IN (?)
)
SELECT
*
FROM
ranked_events
WHERE
pos = 1
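Two things are worth keeping in mind with this query. It filters with @>, so it only matches events whose context contains every requested value, whereas && matches events that share at least one value. It also keeps one row per client_id. Since performance was the main concern, indexes along the following lines might help. The index names are made up, and whether the planner actually uses them depends on the data, so it is worth checking with EXPLAIN (ANALYZE).

```sql
-- Hypothetical index names; verify the effect with EXPLAIN (ANALYZE) on real data.

-- Supports "latest event per client" lookups ordered by created_at.
CREATE INDEX events_client_created_idx
    ON events (client_id, created_at DESC);

-- GIN index on the array column; the default array operator class
-- supports both the && (overlap) and @> (contains) operators.
CREATE INDEX events_context_gin_idx
    ON events USING gin (context);
```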