如果 id 连续 n 行,如何标记一组重复项


如果不

将其分解为多个部分或使用我想避免的光标,在测试中构建连续 id 是很困难的。

伪查询 -

SELECT all 
FROM table with the same description on multiple adjacent rows for >= 4 rows 
and set tag = 'y' and order by id 
(id,description, tag),
(1, 'xxx', 'n'),
(2, 'xxx', 'n'),
(3, 'xxx', 'n'),
(7, 'xxx', 'n'),
(5, 'xxx', 'n'),
(8, 'xxx', 'n'), 
(4, 'xxx', 'n'), 
(6, 'zzz', 'n') 

期望的结果

(1, 'xxx', 'y') 
(2, 'xxx', 'y') 
(3, 'xxx', 'y') 
(4, 'xxx', 'y') 
(5, 'xxx', 'y') 

被称为间隙和孤岛问题。这样的东西应该有效

;with cte as
(SELECT id, 
       description, 
       tag = 'y' ,
       cnt = Count(*)over(partition by description, grp)
FROM  (SELECT *, 
              grp = Sum(CASE WHEN prev_description = description THEN 0 ELSE 1 END)Over(Order by id)
       FROM   (SELECT *, 
                      prev_description = Lag(description) OVER(ORDER BY id) 
               FROM   Yourtable) a) b 
GROUP  BY id, description, grp 
)
Select * from cte 
Where cnt >= 4

使用Row_Number的另一种方法

;with cte as
(SELECT id, 
       description, 
       tag = 'y' ,
       cnt = Count(*)over(partition by description, grp)
FROM  (select Grp = row_number()over(order by id) - 
             row_number()over(partition by description order by id), *
       from Yourtable) b 
GROUP  BY id, description, grp)
Select * from cte 
Where cnt >= 4

我认为这可以做到

select *, 'y' as 'newTag' 
from ( select *
           , count(*) over (partition by [description], grp) as 'grpSize' 
       from ( select * 
                   , ( [id] - row_number() over (partition by [description] order by [id]) ) as grp
              from [consecutive] 
            ) tt
     ) ttt 
where grpSize >= 4
order by [description], grp, [id]

最新更新