我有以下代码来检测单个表中的重复:
UPDATE tab
SET dup = 'Y'
WHERE ROWID IN
(SELECT tab_o.ROWID
FROM tab tab_o,
(SELECT *
FROM tab tab_i
WHERE ROWID IN
(SELECT ROWID
FROM
(SELECT ROWID,
ROW_NUMBER() OVER(PARTITION BY a, b, c ORDER BY a, b, c) dupl
FROM tab
WHERE a IS NOT NULL
AND a = 1
AND b = 1
AND c = 3
)
WHERE dupl > 1
)
) res
WHERE tab_o.a = res.a
AND tab_o.b = res.b
AND tab_o.c = res.c
);
我谷歌搜索了很多网站,发现大多数人都遵循这种有效的方式。但是没有找到有关这些嵌套查询如何工作的正确解释。
使其更简单。无需分析。
样本表:
12:57:37 SYSTEM@dwal> create table dupe_test
2 (a number, b number, c number, is_dupe char);
Table created.
12:57:50 SYSTEM@dwal> insert all
12:57:50 2 into dupe_test values (1, 1, 1, 'n')
12:57:50 3 into dupe_test values (1, 1, 1, 'n')
12:57:50 4 into dupe_test values (1, 1, 1, 'n')
12:57:50 5 into dupe_test values (1, 2, 1, 'n')
12:57:50 6 into dupe_test values (1, 2, 1, 'n')
12:57:50 7 into dupe_test values (1, 2, 1, 'n')
12:57:50 8 select * from dual;
6 rows created.
是:
12:58:17 SYSTEM@dwal> select * from dupe_test;
A B C I
---------- ---------- ---------- -
1 1 1 n
1 1 1 n
1 1 1 n
1 2 1 n
1 2 1 n
1 2 1 n
6 rows selected.
唯一值:
12:59:35 SYSTEM@dwal> select rowid, t.*
2 from dupe_test t
3 where rowid in (select min(rowid)
4 from dupe_test
5 group by a, b, c);
ROWID A B C I
------------------ ---------- ---------- ---------- -
AAARN1AABAAAO9JAAD 1 2 1 n
AAARN1AABAAAO9JAAA 1 1 1 n
更新和结果:
12:59:51 SYSTEM@dwal> update dupe_test t
2 set is_dupe = 'y'
3 where rowid not in (select min(rowid)
4 from dupe_test
5 group by a, b, c);
4 rows updated.
13:00:45 SYSTEM@dwal> select * from dupe_test;
A B C I
---------- ---------- ---------- -
1 1 1 n
1 1 1 y
1 1 1 y
1 2 1 n
1 2 1 y
1 2 1 y
6 rows selected.
更新:
我要做的是我发现重复了表中的一个条目 在同一张表中,所有此类条目将用Dupl标志标记 包括原始条目
仍然不需要分析。只需在您的子查询中添加having count(*) = 1
,因此您只需更新非唯一的行即可。Having
子句基本上是汇总函数的条件,而无需将查询包裹在子查询中。它是最后执行的。
11:03:00 SYSTEM@dwal> insert into dupe_test values (1,3,1,'n') -- add some unique row
11:03:09 2 /
1 row created.
11:03:10 SYSTEM@dwal> update dupe_test set is_dupe = 'y'
11:03:27 2 where rowid not in
11:03:34 3 (select min(rowid) from dupe_test
11:03:51 4 group by a,b,c
11:04:00 5 having count(*) = 1);
6 rows updated.
11:04:06 SYSTEM@dwal> select * from dupe_test;
A B C I
---------- ---------- ---------- -
1 1 1 y
1 1 1 y
1 1 1 y
1 2 1 y
1 2 1 y
1 2 1 y
1 3 1 n
7 rows selected.