我到处都在准备如何解决我的用例,我想出了以下方法来消除我正在执行的查询的重复数据。
基本上,在下面的代码中,在链接表中,我们有节点和控件。
控件是通过方向性将节点链接在一起的东西以Type表示。
然而,每一行只提供一半的信息。换句话说,每一行是节点和控件之间的链接。
我正在尝试创建一个表,其中既有节点,又有控件在中间。
问题是重复,即重复。
如果我们有
Protein01 Biding Ntype0
Molecule01 Binding Ntype 1
最后我想要一排
Protein01 Biding Ntype0 Molecule01 Ntype 1
问题是在这里给出了自加入,我得到了
Protein01 Biding Ntype0 Molecule01 Ntype 1
Molecule01 Biding Ntype1 Protein01 Ntype 0
我需要消除其中一行,我写了下面的代码。但后来我犯了一个严重的错误。
Ora-Ora-01445:无法从联接视图中选择ROWID或对其进行采样没有密钥保留表
我做错了什么?
With dup AS
(
select "Links"."idControl",
"ControlTypes"."sObjectTypeName" as "ControlType",
"Links"."idNode",
"NodeTypes"."sObjectTypeName" as "NodeType",
"Links"."nLinkType" as "nLinkType1",
"Links2"."idNode" as "idNode2",
"Links2"."sObjectTypeName" as "Node2Type",
"Links2"."nLinkType" as "nLinkType2"
from "Links"
join "Nodes" on "Links"."idNode" = "Nodes".ID
join "Controls" on "Links"."idControl" = "Controls".ID
join "ObjectTypes" "NodeTypes" on "Nodes"."idObjectType" = "NodeTypes".ID
join "ObjectTypes" "ControlTypes" on "Controls"."idObjectType" = "ControlTypes"."ID"
join (select "Links2"."idNode",
"Links2"."idControl",
"NodeTypes2"."sObjectTypeName",
"Links2"."nLinkType"
from "Links" "Links2"
join "Nodes" "Nodes2" on "Links2"."idNode" = "Nodes2".ID
join "Controls" "Controls2" on "Links2"."idControl" = "Controls2".ID
join "ObjectTypes" "NodeTypes2" on "Nodes2"."idObjectType" = "NodeTypes2".ID
join "ObjectTypes" "ControlTypes2" on "Controls2"."idObjectType" = "ControlTypes2"."ID"
) "Links2" On "Links2"."idControl" = "Links"."idControl" and "Links2"."idNode" != "Links"."idNode"
)
select * from dup a where rowid in ( select max(rowid) from dup b where a."idNode" = b."idNode2");
请注意,链接表有一个主键Links.ID
您的示例视图非常大,并且示例数据非常不清楚,所以我将使用一些合成数据和视图。
要对行进行重复数据消除,只需要使用其他标识符而不是rowid。此外,您还可以在视图上应用一些分析函数来确定要离开哪一行。从这个问题来看,你似乎需要留下随机的一排。在这种情况下,您可以使用row_number
在应进行重复数据消除的列组中创建标识符。假设您拥有列为key1, key2, tail1, tail2, tail3
的视图,并且key1, key2
的组合应该是唯一的。在这种情况下,您可以应用
with data_view as (
select 1 key1, 1 key2, 1 tail1, 1 tail2, 1 tail3 from dual
union all
select 1, 1, 2, 2, 2 from dual
union all
select 1, 1, 3, 3, 3 from dual
union all
select 2, 2, 1, 2, 3 from dual
union all
select 2, 2, 3, 2, 1 from dual
union all
select 2, 2, 2, 2, 2 from dual
),
dub as (
select row_number() over(partition by key1, key2 order by tail1) rn,
key1, key2, tail1, tail2, tail3
from data_view
)
select key1, key2, tail1, tail2, tail3 from dub a where rn = 1
另一种选择是保持代码的问题,您只需在视图中使用rownum
伪列来创建唯一的行标识符,然后使用它而不是像那样使用rowid
with data_view as (
select 1 key1, 1 key2, 1 tail1, 1 tail2, 1 tail3 from dual
union all
select 1, 1, 2, 2, 2 from dual
union all
select 1, 1, 3, 3, 3 from dual
union all
select 2, 2, 1, 2, 3 from dual
union all
select 2, 2, 3, 2, 1 from dual
union all
select 2, 2, 2, 2, 2 from dual
),
dub as (
select rownum rn,
key1, key2, tail1, tail2, tail3
from data_view
)
select * from dub a
where rn in (select max(rn) from dub b where a.key1 = b.key1 and a.key2 = b.key2)
另请参阅小提琴