Remove duplicate groups in SQL



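In SQL Server, I have the following dataset as the result of a query. I need to group the rows by Uid, remove the "duplicate" groups, and for each set of duplicates return only the group with the most recent value in the date column D. I also need to return only the 2 most recent groups that remain after deduplication.

Two groups are considered duplicates if:
  • they have the same number of rows
  • columns A, B and C are the same
  • the rows are in the same order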

Sample data (only the date column values are available here):
2021-02-19 09:00:00
2021-02-19 10:00:00
2021-02-19 11:00:00
2021-01-19 09:00:00
2021-01-19 10:00:00
2020-02-19 09:00:00
2020-02-19 10:00:00
2020-02-19 11:00:00
2000-10-05 09:00:00
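
One way the three duplicate criteria above might be expressed is as a per-Uid "signature": the row count plus the A, B, C values concatenated row by row in a fixed order. The following is a minimal sketch, not tested against the fiddle. It assumes Table1 has the columns Uid, A, B, C and D1 used in the query in this post, that D1 defines the row order inside a group, and that the values never contain the separators '|' or ';'. The names RowCnt, Signature and LatestD1 are made up for illustration. STRING_AGG requires SQL Server 2017 or later.

```sql
-- Sketch only: one row per Uid with a signature describing the group.
-- Two Uids with equal RowCnt and equal Signature would count as duplicates.
SELECT
    Uid,
    COUNT(*) AS RowCnt,                 -- criterion 1: same number of rows
    STRING_AGG(
        -- criterion 2: the A, B, C values of one row, joined with '|'
        CONCAT(CAST(A AS VARCHAR(MAX)), '|',
               CAST(B AS VARCHAR(MAX)), '|',
               CAST(C AS VARCHAR(MAX))),
        ';'
    ) WITHIN GROUP (ORDER BY D1) AS Signature,  -- criterion 3: rows taken in D1 order
    MAX(D1) AS LatestD1                 -- most recent date in the group
FROM Table1
GROUP BY Uid;
```

Note that CONCAT treats NULL as an empty string, so a NULL and an empty value would produce the same signature, and rows with identical D1 inside one Uid would make the order ambiguous.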

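In the query below, the subquery temp builds a comma-separated list for each of A, B and C per Uid. The subquery temp2 then partitions by A, B, C and computes a rank based on the date. The final outer query keeps only rank 1 and joins back to the table to display the rows.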

select t.Uid, t.A, t.B, t.C, t.D1, t.D2
from (
    -- temp2: rank groups that share the same A, B, C lists by date, newest first
    select Uid, A, B, C, D1,
           rank() over (partition by A, B, C order by D1 desc) as rank
    from (
        -- temp: one row per Uid with comma-separated lists of A, B and C
        SELECT Uid,
            A = STUFF((SELECT ', ' + CAST(A AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')), 1, 1, ''),
            B = STUFF((SELECT ', ' + CAST(B AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')), 1, 1, ''),
            C = STUFF((SELECT ', ' + CAST(C AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')), 1, 1, ''),
            -- most recent D1 of the group, truncated to a date
            cast(max([D1]) as date) D1
        FROM Table1 t1
        GROUP BY Uid
    ) as temp
) as temp2
join Table1 t on temp2.Uid = t.Uid
    and temp2.D1 = t.D1
-- keep only the top-ranked group for each A, B, C combination
where temp2.rank = 1
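
For comparison, here is a rough sketch of how the whole requirement (deduplicate, then keep the 2 most recent groups) might be written with STRING_AGG instead of FOR XML PATH. It has not been run against the fiddle. It assumes SQL Server 2017 or later, the columns Uid, A, B, C and D1 from the query above, D1 as the row order inside a group, and values that never contain '|' or ';'. The CTE and column names (Sig, Dedup, Latest2, RowCnt, Signature, LatestD1, DupRank) are hypothetical.

```sql
WITH Sig AS (
    -- one row per Uid: row count, ordered A|B|C signature, latest date
    SELECT Uid,
           COUNT(*) AS RowCnt,
           STRING_AGG(
               CONCAT(CAST(A AS VARCHAR(MAX)), '|',
                      CAST(B AS VARCHAR(MAX)), '|',
                      CAST(C AS VARCHAR(MAX))),
               ';'
           ) WITHIN GROUP (ORDER BY D1) AS Signature,
           MAX(D1) AS LatestD1
    FROM Table1
    GROUP BY Uid
),
Dedup AS (
    -- among groups with the same signature, number them newest first
    SELECT Uid, LatestD1,
           ROW_NUMBER() OVER (PARTITION BY RowCnt, Signature
                              ORDER BY LatestD1 DESC, Uid) AS DupRank
    FROM Sig
),
Latest2 AS (
    -- keep one group per signature, then only the 2 most recent of those
    SELECT TOP (2) Uid, LatestD1
    FROM Dedup
    WHERE DupRank = 1
    ORDER BY LatestD1 DESC, Uid
)
-- return every row of the surviving groups
SELECT t.Uid, t.A, t.B, t.C, t.D1
FROM Table1 AS t
JOIN Latest2 AS k ON k.Uid = t.Uid
ORDER BY k.LatestD1 DESC, t.Uid, t.D1;
```

The final join in this sketch is on Uid only, so all rows of a surviving group are returned rather than only the row whose D1 matches the group's latest date. ROW_NUMBER is used instead of RANK so that two duplicate groups with the same latest date do not both get rank 1.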

Here is the DB Fiddle link: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=de2127330c2e60d3733bfc9548504142
