我在PostgreSQL 11.0 中有以下表格
entry_id num pathway_name
hsa00010 {00010} Glycolysis / Gluconeogenesis - Homo sapiens (human)
mmu00010 {00010} Glycolysis / Gluconeogenesis - Mus musculus (mouse)
rno00010 {00010} Glycolysis / Gluconeogenesis - Rattus norvegicus (rat)
hsa00011 {00011} Glycolysis / Gluconeogenesis - Homo sapiens (human)
mmu00011 {00011} Glycolysis / Gluconeogenesis - Mus musculus (mouse)
rno00011 {00011} Glycolysis / Gluconeogenesis - Rattus norvegicus (rat)
rno00012 {00012} Glycolysis / Gluconeogenesis - Rattus norvegicus (rat)
我想删除基于列"num"的重复行,并保留以"hsa"开头的entry_id
预期输出为:
entry_id num pathway_name
hsa00010 {00010} Glycolysis / Gluconeogenesis - Homo sapiens (human)
hsa00011 {00011} Glycolysis / Gluconeogenesis - Homo sapiens (human)
rno00012 {00012} Glycolysis / Gluconeogenesis - Rattus norvegicus (rat)
我尝试使用"distinct on(num("筛选行,但无法选择hsa entry_id。有没有办法选择带有hsa id的行并删除重复项?
非常感谢您的帮助!!
DISTINCT ON
和适当的ORDER BY
应该可以做到这一点:
SELECT DISTINCT ON (entry_id) *
FROM atable
ORDER BY entry_id, NOT starts_with(entry_id, 'hsa');
这依赖于CCD_ 3<TRUE
。