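In SQL Server, I have the following dataset after running a query. I need to group the rows by Uid and then remove the "duplicate" groups, keeping from each set of duplicates the group with the most recent value in column D. I also need to return only the 2 most recent deduplicated groups.
A group is defined as a duplicate of another group if:
- it has the same number of rows
- columns A, B, and C are the same
- the rows are in the same order
Sample data (only the date column values are available here):
2021-02-19 09:00:00
2021-02-19 10:00:00
2021-02-19 11:00:00
2021-01-19 09:00:00
2021-01-19 10:00:00
2020-02-19 09:00:00
2020-02-19 10:00:00
2020-02-19 11:00:00
2000-10-05 09:00:00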
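To make the duplicate rule concrete, here is a minimal sketch of one way it could be expressed, not a definitive solution. It builds one ordered "signature" string per Uid from the (A, B, C) rows, keeps the most recent group per signature, and then takes the 2 most recent survivors. The table variable `@Sample`, its values, and the names `Signature`, `LatestD1`, and `rn` are hypothetical and made up for illustration. It also assumes that "row order" means ordering by D1 within each Uid, and that `STRING_AGG` and `CONCAT_WS` are available (SQL Server 2017 or later).

```sql
-- Hypothetical sample rows: column names follow the query in this post,
-- the values are invented for illustration only.
DECLARE @Sample TABLE (Uid INT, A INT, B INT, C INT, D1 DATETIME);

INSERT INTO @Sample (Uid, A, B, C, D1) VALUES
    (1, 10, 20, 30, '2021-02-19T09:00:00'),
    (1, 11, 21, 31, '2021-02-19T10:00:00'),
    (2, 10, 20, 30, '2021-01-19T09:00:00'),  -- Uid 2 repeats the rows of Uid 1
    (2, 11, 21, 31, '2021-01-19T10:00:00'),
    (3, 10, 20, 30, '2020-02-19T09:00:00');  -- different row count, so not a duplicate

WITH Sig AS (
    -- One row per Uid: its (A, B, C) rows concatenated in D1 order, plus its latest date.
    -- Assumption: D1 defines the row order inside a group.
    SELECT Uid,
           STRING_AGG(CAST(CONCAT_WS('|', A, B, C) AS VARCHAR(MAX)), ';')
               WITHIN GROUP (ORDER BY D1) AS Signature,
           MAX(D1) AS LatestD1
    FROM @Sample
    GROUP BY Uid
),
Ranked AS (
    -- Groups sharing a signature are duplicates; number them newest first.
    SELECT Uid, LatestD1,
           ROW_NUMBER() OVER (PARTITION BY Signature ORDER BY LatestD1 DESC) AS rn
    FROM Sig
)
-- Keep the newest group per signature, then the 2 most recent of those.
SELECT TOP (2) Uid, LatestD1
FROM Ranked
WHERE rn = 1
ORDER BY LatestD1 DESC;
```

With these made-up rows, Uid 2 should be dropped as a duplicate of Uid 1, leaving Uid 1 and Uid 3. The row count does not need a separate check because groups with different row counts produce signatures with a different number of segments. One caveat: `CONCAT_WS` skips NULL arguments, so if A, B, or C can be NULL, each would need wrapping (for example with `ISNULL`) to keep the signatures unambiguous.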
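In the query below, the subquery temp builds a comma-separated column for each of A, B, and C. In the subquery temp2, I then partition by A, B, C and compute a rank based on the date. The outermost query keeps only the rows with rank 1 and returns the table output.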
select t.Uid, t.A, t.B, t.C, t.D1, t.D2
from (
    select Uid, A, B, C, D1,
           rank() over (partition by A, B, C order by D1 desc) as rank
    from (
        SELECT Uid,
               A = STUFF((
                       SELECT ', ' + CAST(A AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')
                   ), 1, 1, ''),
               B = STUFF((
                       SELECT ', ' + CAST(B AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')
                   ), 1, 1, ''),
               C = STUFF((
                       SELECT ', ' + CAST(C AS VARCHAR(MAX))
                       FROM Table1 t2
                       WHERE t2.Uid = t1.Uid
                       FOR XML PATH('')
                   ), 1, 1, ''),
               cast(max([D1]) as date) D1
        FROM Table1 t1
        GROUP BY Uid
    ) as temp
) as temp2
join Table1 t
    on temp2.Uid = t.Uid
   and temp2.D1 = t.D1
where temp2.rank = 1
Here is the link to the DB Fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=de2127330c2e60d3733bfc9548504142