我有一组记录(表[#tmp_origin](,其中包含字符串字段中的重复条目([名称](。我想将 [#tmp_origin] 的全部内容插入到目标表 [#tmp_destination] 中,该表不允许重复并且可能已经包含项目。
如果源表中的字符串在目标表中不存在,则 in 将按原样插入到目标表中。如果目标表中的条目已存在,其值与原始表中的条目值相同,则必须先在字符串后附加一个字符串化的增量数字,然后才能将其插入目标表中。
在此示例脚本中,以这种方式移动数据的过程已使用游标实现:
-- create initial situation (origin and destination table, both containing items)
-- Begin
CREATE TABLE [#tmp_origin] ([Names] VARCHAR(10))
CREATE TABLE [#tmp_destination] ([Names] VARCHAR(10))
CREATE UNIQUE INDEX [IX_UniqueName] ON [#tmp_destination]([Names] ASC)
INSERT INTO [#tmp_origin]([Names]) VALUES ('a')
INSERT INTO [#tmp_origin]([Names]) VALUES ('a')
INSERT INTO [#tmp_origin]([Names]) VALUES ('b')
INSERT INTO [#tmp_origin]([Names]) VALUES ('c')
INSERT INTO [#tmp_destination]([Names]) VALUES ('a')
INSERT INTO [#tmp_destination]([Names]) VALUES ('a_1')
INSERT INTO [#tmp_destination]([Names]) VALUES ('b')
-- create initial situation - End
DECLARE @Name VARCHAR(10)
DECLARE NamesCursor CURSOR LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY FOR
SELECT [Names]
FROM [#tmp_origin];
OPEN NamesCursor;
FETCH NEXT FROM NamesCursor INTO @Name;
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @finalName VARCHAR(10)
SET @finalName = @Name
DECLARE @counter INT
SET @counter = 1
WHILE(1=1)
BEGIN
IF NOT EXISTS(SELECT * FROM [#tmp_destination] WHERE [Names] = @finalName)
BREAK;
SET @finalName = @Name + '_' + CAST(@counter AS VARCHAR)
SET @counter = @counter + 1
END
INSERT INTO [#tmp_destination] ([Names]) (
SELECT @finalName
)
FETCH NEXT FROM NamesCursor INTO @Name;
END
CLOSE NamesCursor;
DEALLOCATE NamesCursor;
SELECT *
FROM [#tmp_destination]
/*
Expected result:
a
a_1
a_2
a_3
b
b_1
c
*/
DROP TABLE [#tmp_origin]
DROP TABLE [#tmp_destination]
这工作正常,但当要插入的项目数增加时,其性能会大大降低。
有什么想法可以加快速度吗?
谢谢
使用窗口函数允许对重复项进行编号。您还可以从目标表中获取计数(需要 where 条件来去除您添加的后缀(:
select orig.names,
row_number() over (partition by orig.names order by orig.names) as rowNo,
dest.count
from ##tmp_origin orig
cross apply (select count(1) from #tmp_destination where names = orig.names) as dest
可以从上面构建insert
(如果大于零,则rowNo + dest.count -1
新后缀(。
建议您重构目标临时表,将名称和后缀作为单独的列包含在内 - 这可能意味着有一个新的中间阶段 - 因为这将使匹配逻辑更简单。
像这样:
insert [#tmp_destination]
select CASE WHEN row_number() over(partition by Names order by Names) > 1 THEN Names + '_' + CONVERT(VARCHAR(10), row_number() over(partition by Names order by Names)) ELSE Names END
from [#tmp_origin]
不会使用游标。相反,我会使用 ROW_NUMBER()
构建查询。这样,您可以在原始表中添加一个计数器,然后使用此计数器追加到您的 [名称]:
SELECT [Names], ROW_NUMBER() OVER (PARTITION BY [Names] ORDER BY [Names]) - 1 AS [counter]
INTO #tmp_origin_with_counter
FROM #tmp_origin
SELECT CONCAT([Names], IIF([counter] = 0, '', '_'+ CAST([counter] AS NVARCHAR)))
INTO #tmp_destination
FROM #tmp_origin_with_counter