我有一个简单的表,如下所示:
ClientID ItemID
1 1
1 2
1 3
2 1
2 2
3 3
4 3
5 1
5 2
5 4
5 5
其中两列组合为主键。我现在的任务是识别分配给ClientID的所有唯一ItemID集。所以在我的例子中,集合是:
ItemIDs 1,2,3 (used by ClientID 1)
ItemIDs 1,2 (used by ClientID 2)
ItemIDs 3 (used by ClientIDs 3 and 4)
ItemIDs 1,2,4,5 (used by ClientID 5)
理想情况下,输出将是两个表:
SetID ItemID
1 1
1 2
1 3
2 1
2 2
3 3
4 1
4 2
4 4
4 5
ClientID SetID
1 1
2 2
3 3
4 3
5 4
其中SetID将是用于其他地方的新字段。
目前,我识别唯一集的方法包括使用光标为每个ClientID构建一个有序ItemID的字符串,然后比较输出以获得唯一字符串,最后将其解析回来。它写得很快,但感觉很糟糕。
我相信肯定有比这更好的方法。有什么想法吗?
-- Table to hold test data
declare @T table
(
ClientID int,
ItemID int
)
insert into @T values
(1, 1),(1, 2),(1, 3),
(2, 1),(2, 2),
(3, 3),(4, 3),
(5, 1),(5, 2),(5, 4),(5, 5)
-- Temp table that will hold the generated set's
declare @Tmp table
(
ClientID int,
ItemIDSet varchar(max),
SetID int
)
-- Query the sets using rank() over a comma separated ItemIDSet
insert into @Tmp
select ClientID,
ItemIDSet,
rank() over(order by ItemIDSet) as SetID
from (
select T1.ClientID,
stuff((select ','+cast(T2.ItemID as varchar(10))
from @T as T2
where T1.ClientID = T2.ClientID
order by T2.ItemID
for xml path('')), 1, 1, '') as ItemIDSet
from @T as T1
group by T1.ClientID
) as T
-- Get ClientID and SetID from @Tmp
select ClientID,
SetID
from @Tmp
order by ClientID
-- Get SetID and ItemID from @Tmp
select SetID,
T3.N.value('.', 'int') as ItemID
from (
select distinct
SetID,
'<i>'+replace(ItemIDSet, ',', '</i><i>')+'</i>' as ItemIDSet
from @Tmp
) as T1
cross apply
(
select cast(T1.ItemIDSet as xml) as ItemIDSet
) as T2
cross apply T2.ItemIDSet.nodes('i') as T3(N)
结果:
ClientID SetID
----------- -----------
1 2
2 1
3 4
4 4
5 3
SetID ItemID
----------- -----------
1 1
1 2
2 1
2 2
2 3
3 1
3 2
3 4
3 5
4 3
SetID的值与您提供的输出中的值不完全相同,但我认为这不会是一个大问题。SetID由ItemIDSet排序的秩函数rank() over(order by ItemIDSet)
生成。
拿着它转一圈。