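I have a staging table of employees with the columns firstname, middlename, lastname, department, effectivedate, canceldate,
and processdate.
I have to compare each row with the other rows to find duplicates, and if two rows match, I have to pick the row with the greater processdate.
I am using a CTE with the DENSE_RANK
function to find duplicates, but I don't know how to compare rows within the same table.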
Thanks in advance.
This finds the duplicate records that have the greater process date by joining the table to itself:
select s1.*
from staging s1
join staging s2
  on s1.firstname = s2.firstname
 and s1.middlename = s2.middlename
 and s1.lastname = s2.lastname
 and s1.department = s2.department
 -- compare other columns that make records "duplicates" as appropriate
 and s1.processdate > s2.processdate; -- s1 is the later record of each matching pair
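
Since you mentioned a CTE with a ranking function, here is a minimal sketch of that route as an alternative. It assumes that two rows are duplicates when firstname, middlename, lastname and department all match (adjust the PARTITION BY list if effectivedate or canceldate should count too), and it uses ROW_NUMBER rather than DENSE_RANK so that exactly one row per group gets rank 1. The names ranked, rn and copies are made up for this example.

```sql
-- Sketch only: the PARTITION BY columns are an assumption about what
-- makes two staging rows "duplicates"; change them to fit your rule.
with ranked as (
    select s.*,
           -- number the rows of each duplicate group, newest processdate first
           row_number() over (
               partition by firstname, middlename, lastname, department
               order by processdate desc
           ) as rn,
           -- how many rows the group contains
           count(*) over (
               partition by firstname, middlename, lastname, department
           ) as copies
    from staging s
)
select firstname, middlename, lastname, department,
       effectivedate, canceldate, processdate
from ranked
where rn = 1        -- keep the newest row of each group
  and copies > 1;   -- drop this line to also return rows that have no duplicate
```

Two differences from the self-join are worth knowing. When a group has three or more copies, the join can return several rows for it, while this returns one. And PARTITION BY puts NULL values in the same group, whereas `=` in a join condition never matches NULL to NULL, which matters if middlename can be empty. If two copies share the same processdate, ROW_NUMBER picks one of them arbitrarily unless you add a tie-breaker to the ORDER BY.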