从(id,父id)表派生(id,兄弟id)表

  • 本文关键字:id 派生 兄弟 c# mysql ontology
  • 更新时间 :
  • 英文 :


摘要:我有一个概念表concept(id, definition, parent-id, path-to-root),我想创建一个扩展的概念表extended_concept(concept-id, definition, parent-id, siblings-ids),其中siblings-ids是一个逗号分隔的字符串,类似于'sibling-id-1,sibling-id-2,..',最小同级数等于3。


详细信息:我有以下信息:

  1. 表示本体(图(的概念表concepts(id, definition, parent-id, path-to-root)。。它包含10000000条记录

以下是一些示例记录:

('A28681773', "definition2", 'A32452653', 'A24672666.A24681708.A24674976.A27414154.A32452653')
('A33193765', "definition1", 'A24670862', 'A24672666.A24681708.A24674976.A27414154.A27414079.A24679016.A24670862')
  1. 一个C#程序,旨在创建一个包含(concept-id, definition, parent-id, siblings-ids)的大型.csv文件,不包括具有少于3个同级的概念

我认为依赖C#来构建这个.csv文件在时间和空间方面具有很高的复杂性。。我无法想出一个sql查询来构建这样一个表,顺便说一句,只需将表导出为.csv文件就很容易了。

有什么想法吗?

已解决。。

  1. 按父id对概念进行分组,并使用group_concat聚合函数将所有子项放在其父项旁边的逗号分隔字符串中
create view parent as
(
# group_concat is the aggregation function that will concat the fields of Id Column in a comma-separated string
select Parent_Id as Id, group_concat(Id) as kids, count(Id) as kids_count
from concept
group by Parent_Id
having kids_count > 3); # must be > 3, not >= 3, because later we will exclude the concept from being a sibling for itself..
  1. 在概念中加入parentconcept表。Parent_Id=父级。Id.需要一些字符串操作(通过replacetrim函数(来修复当我们将概念排除为其自身的同级时发生的错误
create table extended_concept as
(
select 
concept.Id,
concept.Parent_Id,
concept.Definition,
# excluding the concept from being a sibling to itself will produce errors like: it will convert 'Id1,Id2,Id3' to ',Id2,Id3' or 'Id1,,Id3' or 'Id1,Id2,'
# so to fix it we will trim BOTH leading and trailing',', and replace ',,' with only one ','
TRIM(BOTH ',' from
REPLACE(
REPLACE(parent.kids,
concept.Id,
''),
',,', ',')
)
as siblings,
(parent.kids_count - 1) as siblings_count # because we excluded the concept from being a sibling for itself
from concept
inner join
parent
where concept.Parent_Id = parent.Id
order by siblings_count ASC
); -- count = 9,759
  1. extended_concept导出为.csv文件,然后与父视图一起删除
drop view parent;
drop table extended_concept;

最新更新