我有两列应该包含相同的文本——有时当内容很大时,很难找到差异的实际位置。
这并不完美,但如果有一个函数可以接受两列的值并返回第一个匹配发生的位置,那将非常有帮助。由于这将被称为select中的表单,因此性能将相当重要,但它只会偶尔运行,因此不是一个大问题。
或者,一个可以在源代码管理实用程序中执行类似DIFF的功能将是理想的,但我无法想象这会有多复杂。
您可以对该解决方案进行大量改进,但部分来自以下逻辑:
declare @i as int
While @i < LEN(String1)
BEGIN
SELECT @i AS "index"
FROM YourTable
WHERE SUBSTRING(String1, @i, 1) != SUBSTRING(String2, @i, 1)
set @i = @i + 1
END
要查找两列是否不同,可以使用BINARY_CHECKSUM()。这足够敏感,甚至可以区分大写和小写。
至于查找两个字符串之间的差异,我会使用2个cte,然后使用EXCEPT。第一个结果将是字符串的第一个偏差。
declare @test1 VARCHAR(1000) = 'I have two columns that should contain identical text - sometimes when the contents are large it is very difficult to find where the discrepancy is actually located.'
declare @test2 VARCHAR(1000) = 'I have two columns that should contain identical text - sometimes when the contents are Large it is very difficult to find where the discrepancy is actually located.'
SELECT CASE WHEN BINARY_CHECKSUM(@test1) = BINARY_CHECKSUM(@test2) THEN 'Identical' ELSE 'Not Identical' END AS check_
;with cte1 AS (
SELECT
1 AS col_
UNION ALL
SELECT col_ + 1
FROM cte1
WHERE col_ < (SELECT LEN(@test1))
),
cte2 AS (
SELECT
1 AS col_
UNION ALL
SELECT col_ + 1
FROM cte2
WHERE col_ < (SELECT LEN(@test2))
), final_cte AS (
SELECT col_, SUBSTRING(@test1, col_, 1) AS char_, BINARY_CHECKSUM(SUBSTRING(@test1, col_, 1)) AS char_checksum
FROM CTE1
EXCEPT
SELECT col_, SUBSTRING(@test2, col_, 1) AS char_, BINARY_CHECKSUM(SUBSTRING(@test2, col_, 1)) AS char_checksum
FROM CTE2
)
select col_, char_
FROM final_cte
ORDER by col_
OPTION (MAXRECURSION 1000)
尝试一下:
CREATE FUNCTION dbo.fcn_DiffPosition
(
@str1 nvarchar(max),
@str2 nvarchar(max)
)
RETURNS INT
AS
BEGIN
DECLARE @MinPosition int = NULL
;WITH cte1 AS
(
SELECT 1 AS CharacterPosition,
SUBSTRING(@str1, 1, 1) AS [Character]
UNION ALL
SELECT CharacterPosition + 1,
SUBSTRING(@str1, CharacterPosition + 1, 1)
FROM cte1
WHERE CharacterPosition < LEN(@str1)
),
cte2 AS
(
SELECT 1 AS CharacterPosition,
SUBSTRING(@str2, 1, 1) AS [Character]
UNION ALL
SELECT CharacterPosition + 1,
SUBSTRING(@str2, CharacterPosition + 1, 1)
FROM cte2
WHERE CharacterPosition < LEN(@str2)
)
SELECT @MinPosition = MIN(ISNULL(cte1.CharacterPosition, cte2.CharacterPosition))
FROM cte1
FULL JOIN cte2 ON cte1.CharacterPosition = cte2.CharacterPosition
WHERE ISNULL(cte1.[Character], '') != ISNULL(cte2.[Character], '')
OPTION (MAXRECURSION 0)
RETURN @MinPosition
END
如果两个字符串相同,函数将返回null。根据您的排序规则,这可能区分大小写。示例:
SELECT dbo.fcn_DiffPosition('Hello World', 'HelloWorld') -- 6
SELECT dbo.fcn_DiffPosition('Hello', 'HelloWorld') -- 6
SELECT dbo.fcn_DiffPosition('Dog', 'Dog') -- NULL