How can I randomly shuffle unique key values in a dimension table?



Given the following table definition:

CREATE TABLE dbo.Test 
(
[ID] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[Code] CHAR(10) NOT NULL UNIQUE NONCLUSTERED
);

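To de-identify the dataset, I need to randomize the Code values.

I want to keep the ID values so that the other tables referencing them do not need to change.

I tried the following: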

SELECT t.ID, c.Code
FROM dbo.Test AS t
CROSS APPLY (SELECT TOP(1) v.Code
FROM dbo.Test AS v
WHERE v.Code <> t.Code
ORDER BY NEWID()) AS c

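The problem is that the same unique key value gets assigned to more than one row.

Any suggestions on how to solve this?

Here is how I would do it: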

-- Sample data
DECLARE @test TABLE (ID INT IDENTITY, [Code] CHAR(10));
INSERT @test
VALUES (REPLICATE('A',10)),(REPLICATE('B',10)),(REPLICATE('C',10)),(REPLICATE('D',10));
-- Solution
SELECT t.ID, c.[Code]
FROM   @test AS t
JOIN   
(
SELECT SK = ROW_NUMBER() OVER (ORDER BY NEWID()), t.[Code]
FROM   @test AS t
) AS c ON c.SK = t.ID;

Returns:

ID          Code
----------- ----------
1           DDDDDDDDDD
2           AAAAAAAAAA
3           BBBBBBBBBB
4           CCCCCCCCCC

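(Your results may differ, but that is the point.)

Quick update:

Note that if your IDs are not contiguous (for example, because some rows have been deleted), the join on ID will miss rows. In that case you could number both sides with ROW_NUMBER() and join on that instead: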
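One way to convince yourself that the join above hands out each Code exactly once is to compare the row count with the number of distinct codes in the shuffled result. This is only a sanity-check sketch against the same sample data; the column aliases TotalRows and DistinctCodes are my own names, not part of the original query.

```sql
-- Same sample data as above
DECLARE @test TABLE (ID INT IDENTITY, [Code] CHAR(10));
INSERT @test
VALUES (REPLICATE('A',10)),(REPLICATE('B',10)),(REPLICATE('C',10)),(REPLICATE('D',10));

-- ROW_NUMBER() assigns each Code a distinct SK, so every ID matches exactly one Code.
-- If the shuffle is a true permutation, both counts are equal.
SELECT TotalRows     = COUNT(*),
       DistinctCodes = COUNT(DISTINCT c.[Code])
FROM   @test AS t
JOIN
(
  SELECT SK = ROW_NUMBER() OVER (ORDER BY NEWID()), s.[Code]
  FROM   @test AS s
) AS c ON c.SK = t.ID;
```

The CROSS APPLY attempt in the question fails this check because each outer row draws its own independent TOP(1), so two rows can draw the same Code.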

WITH T1 AS (SELECT t.ID, SK = ROW_NUMBER() OVER (ORDER BY t.ID) FROM @test AS t),
T2 AS (SELECT SK = ROW_NUMBER() OVER (ORDER BY NEWID()), t.[Code] FROM @test AS t)
SELECT T1.ID, T2.[Code]
FROM T1
JOIN T2 ON T1.SK = T2.SK;
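The queries above only return the shuffled pairs. If the goal is to write them back, something along these lines may work. It is a sketch, not a tested migration: #Test is a hypothetical stand-in for dbo.Test (with gaps in ID to mimic deleted rows), and tgt is just an alias I picked. It relies on SQL Server validating the UNIQUE constraint for the statement as a whole, so a single UPDATE that permutes the values should not trip over intermediate duplicates.

```sql
-- Hypothetical stand-in for dbo.Test, with non-contiguous IDs
CREATE TABLE #Test
(
  [ID]   INT      NOT NULL PRIMARY KEY,
  [Code] CHAR(10) NOT NULL UNIQUE NONCLUSTERED
);
INSERT #Test ([ID], [Code])
VALUES (1, REPLICATE('A',10)), (2, REPLICATE('B',10)),
       (4, REPLICATE('C',10)), (7, REPLICATE('D',10));

-- T1 numbers the rows by ID; T2 numbers the codes in random order
WITH T1 AS (SELECT t.ID, SK = ROW_NUMBER() OVER (ORDER BY t.ID) FROM #Test AS t),
     T2 AS (SELECT SK = ROW_NUMBER() OVER (ORDER BY NEWID()), t.[Code] FROM #Test AS t)
UPDATE tgt
SET    tgt.[Code] = T2.[Code]      -- each ID receives exactly one shuffled Code
FROM   #Test AS tgt
JOIN   T1 ON T1.ID = tgt.ID
JOIN   T2 ON T2.SK = T1.SK;

SELECT [ID], [Code] FROM #Test ORDER BY [ID];

DROP TABLE #Test;
```

Keep in mind that a random permutation can leave some rows with their original Code, which the `v.Code <> t.Code` filter in the question was trying to prevent. If that matters for de-identification, check for unchanged rows afterwards and rerun the shuffle.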

Latest update