I'm trying to identify gaps in a series of number ranges (SQL Server). My scenario looks like this:
ID Start End
1 1 4
2 1 6
3 2 4
4 8 10
5 13 14
Visual
-------------------------------
1-2-3-4
1-2-3-4-5-6
2-3-4
- -8-9-10
- - -13-14
The result of this might be:
Table
-------------------------------
ID Start End Gap
4 8 10 -1
5 13 14 -2
Ultimately, I want to know the ranges of the gaps, but I should be able to work those out from the above:
Missing
7
11-12
The solutions I've come up with are either too slow or don't account for overlapping ranges (e.g. ID 2).
CREATE TABLE #Docs (
[Rank] INT, --DENSE_RANK () OVER(ORDER BY BegProd)
ControlNumber BIGINT,
BegProd INT,
EndProd INT
)
SELECT
T1.ControlNumber,
T1.BegProd,
T1.EndProd,
MAX(T2.EndProd) AS [PreviousEndProd],
[Gap] = T1.BegProd - MAX(T2.EndProd) - 1
FROM #Docs T1
INNER JOIN #Docs T2
ON T1.[Rank] = T2.[Rank] + 1
AND T1.EndProd > T2.EndProd
GROUP BY T1.ControlNumber, T1.BegProd, T1.EndProd
HAVING T1.BegProd - MAX(T2.EndProd) > 1
The table has more than 2 million rows, and the values run up to around 1 billion.
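Edit: fixed the "Missing" table. The Gap column indicates how many numbers are missing before that Start value. (The count is exclusive of the endpoints, so the missing 7 counts as 1 number.)
Try this: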
create table #docs(id int, start int, [end] int)
insert #docs values(1,1,4),(2,1,6),(3,2,4),(4,8,10),(5,13,14)
;with a as
(
-- range starts that do not fall strictly inside any other range
select start, dense_rank() over (order by start) rn
from #docs t where not exists (select 1 from #docs where t.start > start and t.start < [end])
group by start
), b as
(
-- range ends that do not fall strictly inside any other range
select [end], dense_rank() over (order by [end]) rn
from #docs t where not exists (select 1 from #docs where t.[end] > start and t.[end] < [end])
group by [end]
)
-- pair each surviving start with the previous surviving end;
-- pairs that are directly adjacent (no missing numbers) are skipped
select
case when a.[start]= b.[end]+2 then cast(a.start-1 as varchar(21))
else cast(b.[end]+1 as varchar(10)) +'-' + cast(a.start - 1 as varchar(10)) end missing
from a join b on a.rn - 1 = b.rn
and a.[start] <> b.[end] + 1
Result:
Missing
7
11-12
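Since the question mentions a table with millions of rows, a single-pass variant may also be worth trying. The following is only a sketch, not a tested replacement for the query above: it assumes SQL Server 2012 or later (for window frames) and reuses the #docs sample table. The names ordered, prev_max_end and gap_size are made up for illustration. It keeps a running maximum of [end] over all earlier rows, which is what makes overlapping ranges such as ID 2 harmless.

```sql
-- Sketch only: assumes SQL Server 2012+ and the #docs table created above.
;with ordered as
(
    select start, [end],
           -- highest [end] among all rows that sort before this one
           -- (NULL for the very first row)
           max([end]) over (order by start, [end]
                            rows between unbounded preceding and 1 preceding) as prev_max_end
    from #docs
)
select
    case when start = prev_max_end + 2
         then cast(start - 1 as varchar(10))                 -- single missing number
         else cast(prev_max_end + 1 as varchar(10)) + '-'
              + cast(start - 1 as varchar(10))               -- missing range
    end as missing,
    start - prev_max_end - 1 as gap_size                     -- count of missing numbers
from ordered
where start > prev_max_end + 1   -- a gap exists only if start is past everything seen so far
order by start
```

On the sample data this should return 7 (gap_size 1) and 11-12 (gap_size 2). Whether it is actually faster on 2 million rows depends on indexing (an index on start, [end] would likely help), so treat it as something to benchmark rather than a guaranteed improvement.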