我的数据如下:
|ID|Name |LinkType|LinkID |
|--|-----|--------|-------|
|1 |item1|Task |12345 |
|1 |item1|Task |22347 |
|1 |item1|Event |21342 |
|2 |item2|Task |24533 |
|2 |item2|Event |43532 |
|2 |item2|Event |457 |
我正在尝试将其转换为以下格式:
|ID|Name |Tasks |Events |
|--|-----|-----------|---------|
|1 |item1|12345,22347|21342 |
|2 |item2|24533 |43532,457|
我有一个正在运行的SQL查询,但我想知道是否有任何方法可以优化它,因为它目前非常慢。我的数据集是大约800k行,具有大约600k个唯一ID。
SELECT ID
,Name
,(
SELECT LISTAGG(Attachments2.LinkID, ',') within
GROUP (
ORDER BY Attachments2.LinkID
) LinkIDs
FROM Attachments Attachments2
WHERE Attachments.ID = Attachments2.ID
AND LinkType = 'Task'
GROUP BY Attachments2.ID
) Tasks
,(
SELECT LISTAGG(Attachments3.LinkID, ',') within
GROUP (
ORDER BY Attachments3.LinkID
) LinkIDs
FROM Attachments Attachments3
WHERE Attachments.ID = Attachments3.ID
AND LinkType = 'Event'
GROUP BY Attachments3.ID
) Events
FROM Attachments
GROUP BY ID
,Name
我总共有4种不同的LinkType需要进行分组,这种方法非常慢。我在5分钟内得到500个结果,所以得到600k行将花费太多时间。
您正在寻找条件聚合(聚合函数中的CASE WHEN
(:
SELECT
id,
name,
LISTAGG(CASE WHEN linktype = 'Task' THEN linkid END, ',')
WITHIN GROUP (ORDER BY linkid) AS task_ids,
LISTAGG(CASE WHEN linktype = 'Event' THEN linkid END, ',')
WITHIN GROUP (ORDER BY linkid) AS event_ids
FROM attachments
GROUP BY id, name
ORDER BY id, name;
当然,您应该在(id, linktype)
上有一个唯一的索引。确保它是按这个顺序排列的,即先有项目id,然后是链接类型,这样你就可以快速访问项目的数据。另请参阅您请求的评论部分中我的其他评论。(