我有一个SQL表,它有两列——a_id和b_id(a_id与b_id有多对多关系(。我正在尝试将所有a_id与所有链接的a_id和b_id连接起来。例如
a_id | b_id |
---|---|
A | 1 |
B | 2 |
C | 2 |
A | 3 |
D | 3 |
识别点簇是一个走图问题,在SQL中需要递归CTE。您将首先生成所有边,然后遍历图形,同时跟踪已经访问过的边,最后通过最小的公共边来识别每组:
with
edges as (
select t1.a_id as a_id1, t2.a_id as a_id2
from temp_data t1
inner join temp_data t2 on t1.b_id = t2.b_id
),
cte as (
select a_id1, a_id2, array_construct(a_id1) as visited
from edges
where a_id1 = a_id2
union all
select c.a_id1, e.a_id2, array_append(c.visited, e.a_id2)
from cte c
inner join edges e on e.a_id1 = c.a_id2
where not array_contains(e.a_id2, c.visited)
)
select a_id1, dense_rank() over(order by min(a_id2)) as grp
from cte
group by a_id1
order by grp, a_id1
这将每个a_id
和它所属的组放在一个单独的行中。如果你想将结果作为一个数组,以及对应的b_id
,你可以将外部查询更改为:
with ...
select array_agg(distinct t.a_id) as a_ids, array_agg(distinct t.b_id) as b_ids
from temp_data t
inner join (
select a_id1, dense_rank() over(order by min(a_id2)) as grp
from cte
group by a_id1
) x on x.a_id1 = t.a_id
group by x.grp