Why use Max(ROWID) or Min(ROWID) when deleting duplicate records?

We can delete duplicate records without using the pseudo column rowid... What is the meaning of Max(rowid)/Min(rowid) when deleting duplicate records?

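ROWID is an internal row identifier that Oracle uses to locate the physical record. So even though you may have duplicated "ID" values, every record's ROWID is still unique.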

create table prices(
   id       varchar2(15) not null
  ,price    number       not null
  ,upd_date date         not null
--  ,primary key(id)
);

ROWID                ID PRICE UPD_DATE
------------------   -- ----- ----------
AChTgbADaAAFgxYAAA   A  7     2018-04-10
AChTgbADaAAFgxYAAB   B  8     2018-04-09
AChTgbADaAAFgxYAAC   B  8     2018-04-09
AChTgbADaAAFgxYAAD   B  8     2018-04-09
AChTgbADaAAFgxYAAE   C  9     2018-04-06
AChTgbADaAAFgxYAAF   C  8     2018-04-05
AChTgbADaAAFgxYAAG   C  7     2018-04-04
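
A listing like the one above could be reproduced with something along these lines. This is only a sketch: the rows are the ones shown in the listing, and the ROWID values on your system will differ, since Oracle assigns them.

```sql
-- Populate the example table with the rows from the listing above.
insert into prices (id, price, upd_date) values ('A', 7, date '2018-04-10');
insert into prices (id, price, upd_date) values ('B', 8, date '2018-04-09');
insert into prices (id, price, upd_date) values ('B', 8, date '2018-04-09');
insert into prices (id, price, upd_date) values ('B', 8, date '2018-04-09');
insert into prices (id, price, upd_date) values ('C', 9, date '2018-04-06');
insert into prices (id, price, upd_date) values ('C', 8, date '2018-04-05');
insert into prices (id, price, upd_date) values ('C', 7, date '2018-04-04');
commit;

-- ROWID is a pseudocolumn, so it has to be selected explicitly;
-- "select *" does not return it.
select rowid, id, price, upd_date
  from prices
 order by rowid;
```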

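The max(rowid) in a group is often the most recently inserted record, but that assumption is wrong often enough that you cannot rely on it in production code. It can only be relied on for deleting perfect duplicates. A perfect duplicate is one where select distinct * yields a single record. For all other uses you need a discriminator. A discriminating column can be used to tell two records apart, for example an update date that indicates the modification time.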
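
To check whether a table actually contains perfect duplicates, one option is to group by every column and count. The query below is a sketch against the example PRICES table; the column alias copies is just an illustrative name.

```sql
-- Rows that are identical in every column end up in the same group.
-- Any group with more than one member is a set of perfect duplicates.
select id
      ,price
      ,upd_date
      ,count(*) as copies
  from prices
 group by id, price, upd_date
having count(*) > 1;
```

With the example data only the three B rows share all column values, so only B is reported; the C rows share an ID but differ in price and date, which is the case where a discriminator is needed.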

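If you use the typical ROWID approach to de-duplicate my example table, you will incorrectly delete the most recent price, 9 (as evidenced by UPD_DATE).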

delete
  from prices
 where rowid not in(
        select max(rowid)
          from prices
      group by id);
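
Before running a delete like this, it may help to preview which rows it would keep. The following is a sketch that reuses the same subquery; the action column and its 'keep'/'delete' labels are made up for illustration.

```sql
-- Mark each row according to whether the max(rowid) rule would keep it.
select id
      ,price
      ,upd_date
      ,case
         when rowid in (select max(rowid) from prices group by id)
         then 'keep'
         else 'delete'
       end as action
  from prices
 order by id, upd_date desc;
```

With the ROWIDs shown in the listing, the row kept for C is the one with price 7 from 2018-04-04, and the newer rows with prices 9 and 8 are marked for deletion.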

A better approach is to use the discriminator first, and fall back on ROWID only as a last resort.

delete 
  from prices
 where rowid in(
        select rid
          from (select rowid as rid
                      ,row_number() over(            -- Assign a sequence number
                          partition by id            -- Group rows by ID
                              order by upd_date desc -- Sort them by upd_date first  
                                      ,rowid desc    -- Then by ROWID
                      ) as rn
                   from prices
               )
     -- The most recent record will be rn = 1.
     -- The second most recent record will be rn = 2, etcetera
        where rn > 1 -- Select only the duplicates ("after" the most recent record)
       );
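
Once only one row per ID remains, the primary key that is commented out in the table definition could be added so that duplicates cannot come back. This is a sketch, and the constraint name prices_pk is an assumption chosen for illustration.

```sql
-- Enforce one row per ID from now on.
-- The statement fails if duplicate ID values are still present.
alter table prices
  add constraint prices_pk primary key (id);
```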

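If the whole row is duplicated and you want to delete all but one copy, there is no simple way in SQL to pick the rows to delete without using the system-generated row address.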
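
For this whole-row case, a min(rowid) or max(rowid) delete is safe as long as the grouping covers every column, because the copies are interchangeable. A minimal sketch against the example PRICES table, assuming the table has only the three columns shown:

```sql
-- Keep one arbitrary copy (the one with the lowest ROWID) of each
-- fully identical row and delete the rest.
delete
  from prices
 where rowid not in(
        select min(rowid)
          from prices
      group by id, price, upd_date);
```

Unlike grouping by id alone, this leaves rows that share an ID but differ in price or date untouched.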

Taking Ronnis' PRICES table as an example, we see that the three rows for B are exact duplicates:

ID PRICE UPD_DATE
-- ----- -----------
A      7 10/04/2018
B      8 09/04/2018
B      8 09/04/2018
B      8 09/04/2018
C      7 04/04/2018
C      8 05/04/2018
C      9 06/04/2018

We could use something like this:

delete prices where id = 'B' and rownum <= 2;

But this is not a good solution, because we have to know the ID and the count, and it handles only one ID at a time.

We can delete them without explicitly specifying ROWID by using PL/SQL:

declare
    cursor c_prices is
        select id, price
              , row_number() over (partition by id order by upd_date desc) as seq
        from   prices
        for update;
begin
    for r in c_prices
    loop
        if r.seq > 1 then
            delete prices where current of c_prices;
        end if;
    end loop;
end;

Of course, the where current of syntax uses ROWID internally.

Using ROWID explicitly makes it much simpler:

delete prices where rowid in
( select lag(rowid) over (partition by id order by upd_date) from prices );

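This finds all the "previous" rows in date order and deletes the corresponding rows. The last row in each set does not appear in that list, so it is not deleted.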
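
Whichever delete is used, the result can be checked before committing. The query below is a sketch of such a check, assuming the goal is exactly one row per ID; the alias copies is illustrative.

```sql
-- Any ID listed here still has more than one row.
select id
      ,count(*) as copies
  from prices
 group by id
having count(*) > 1;
```

If the query returns no rows, each ID is down to a single record. If the wrong rows were removed, the delete can still be undone with rollback as long as it has not been committed.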
