SQL/Python-递归连接id列



我有一个SQL表,它有两列——a_id和b_id(a_id与b_id有多对多关系(。我正在尝试将所有a_id与所有链接的a_id和b_id连接起来。例如

a_id b_id
A 1
B 2
C 2
A 3
D 3

识别点簇是一个走图问题,在SQL中需要递归CTE。您将首先生成所有边,然后遍历图形,同时跟踪已经访问过的边,最后通过最小的公共边来识别每组:

with 
edges as (
select t1.a_id as a_id1, t2.a_id as a_id2
from temp_data t1
inner join temp_data t2 on t1.b_id = t2.b_id
),
cte as (
select a_id1, a_id2, array_construct(a_id1) as visited 
from edges 
where a_id1 = a_id2
union all
select c.a_id1, e.a_id2, array_append(c.visited, e.a_id2)
from cte c
inner join edges e on e.a_id1 = c.a_id2
where not array_contains(e.a_id2, c.visited)
)
select a_id1, dense_rank() over(order by min(a_id2)) as grp
from cte
group by a_id1
order by grp, a_id1

这将每个a_id和它所属的组放在一个单独的行中。如果你想将结果作为一个数组,以及对应的b_id,你可以将外部查询更改为:

with ...
select array_agg(distinct t.a_id) as a_ids, array_agg(distinct t.b_id) as b_ids
from temp_data t
inner join (
select a_id1, dense_rank() over(order by min(a_id2)) as grp
from cte
group by a_id1
) x on x.a_id1 = t.a_id
group by x.grp

相关内容

  • 没有找到相关文章

最新更新