Split a delimited string into a table with multiple rows and columns



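Please help. I am new to SQL and facing the following situation. I searched Google for a solution but had no luck.

I have a temp table named #TEMP with a single column named Result; the number of rows depends on the length of the CSV string. SELECT * FROM #TEMP returns data like this:

Result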

88.47,1,263759,10.00|303.53,2,264051,13.00|147.92,3,264052,6.00|43.26,4,268394,10.00| 127.7,5,269229,4.00|

Use the link below to see the result straight from the database:
http://design.northdurban.com/DatabaseResult.png

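I need a solution that reads this data from the existing temp table and inserts it into another temp table with separate rows and columns. The required output is shown at the link below: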

http://design.northdurban.com/capture.png

Please help. I believe this post will be useful to many other users, as I have not found any existing solution.

First, split the string into rows on the pipe delimiter |

DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
SELECT Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)')))
FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
       CROSS APPLY Data.nodes ('/M') AS Split(a) 
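As a possible alternative for the row-splitting step, here is a minimal sketch that assumes SQL Server 2016 or later (database compatibility level 130+), where the built-in STRING_SPLIT function is available. The column alias record is only illustrative, and STRING_SPLIT does not guarantee the order of the rows it returns.

```sql
-- Sketch only: assumes SQL Server 2016+ (compatibility level 130 or higher).
DECLARE @str VARCHAR(max) = '88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00|';

-- STRING_SPLIT returns one row per piece in a column named "value".
-- The trailing '|' produces an empty piece, which the WHERE clause drops.
SELECT LTRIM(RTRIM(value)) AS record
FROM   STRING_SPLIT(@str, '|')
WHERE  LTRIM(RTRIM(value)) <> '';
```

The XML cast shown above also works on older versions, but it fails if the data contains characters such as & or < that are not valid in unescaped XML.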

Then use the PARSENAME trick to split each row into separate columns

SELECT Id,c1,c2,c3
FROM  (SELECT Id=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
              C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
              c2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
              c3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
       FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
              CROSS APPLY Data.nodes ('/M') AS Split(a)) a
WHERE  id IS NOT NULL 

SQLFiddle demo
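To make the PARSENAME trick easier to follow, here is a small sketch that applies it to a single record. The variable names @rec and @dotted are only illustrative and do not appear in the answer above.

```sql
-- One record from the sample data, used here purely for illustration.
DECLARE @rec VARCHAR(100) = '88.47,1,263759,10.00';

-- Hide the decimal points as ';' and turn the commas into dots,
-- giving '88;47.1.263759.10;00', which looks like a four-part object name.
DECLARE @dotted VARCHAR(100) = REPLACE(REPLACE(@rec, '.', ';'), ',', '.');

-- PARSENAME counts parts from the right, so part 4 is the leftmost field.
-- The outer REPLACE restores the decimal points.
SELECT REPLACE(PARSENAME(@dotted, 4), ';', '.') AS c1,  -- 88.47
       REPLACE(PARSENAME(@dotted, 3), ';', '.') AS c2,  -- 1
       REPLACE(PARSENAME(@dotted, 2), ';', '.') AS c3,  -- 263759
       REPLACE(PARSENAME(@dotted, 1), ';', '.') AS c4;  -- 10.00
```

All four values come back as strings, so they still need a CAST or CONVERT if numeric columns are wanted.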

Update: for better performance, try this.

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL; 

Update 2: to parse multiple rows from a table, use this code.

Sample table with data

create table #test(string varchar(8000))
insert into #test values
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'),
('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')

Query

SELECT c1,c2,c3,c4
FROM   (SELECT C1=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 4), ';', '.'),
               C2=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 3), ';', '.'),
               C3=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 2), ';', '.'),
               C4=Replace(Parsename(Replace(Replace(col, '.', ';'), ',', '.'), 1), ';', '.')
        FROM   (SELECT Split.a.value('.', 'VARCHAR(100)') col
                FROM   (SELECT Cast ('<M>' + Replace(string, '|', '</M><M>') + '</M>' AS XML) AS Data
                        FROM   #test) AS A
                       CROSS APPLY Data.nodes ('/M') AS Split(a))v) a
WHERE  c1 IS NOT NULL; 
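The question asks for the parsed values to end up in another temp table. One way that could look is sketched below, assuming SQL Server 2016 or later for STRING_SPLIT and TRY_CAST. The table names #src and #result, the aliases, and the target types DECIMAL(10, 2) and INT are assumptions chosen for illustration, not part of the answer above.

```sql
-- Hypothetical source table holding one delimited string per row.
CREATE TABLE #src (string VARCHAR(8000));
INSERT INTO #src VALUES ('88.47,1,263759,10.00| 303.53,2,264051,13.00|');

-- Split each string on '|', apply the PARSENAME trick to every piece,
-- and write typed columns into a new temp table with SELECT ... INTO.
-- TRY_CAST yields NULL instead of raising an error on unparsable text.
SELECT TRY_CAST(REPLACE(PARSENAME(d.dotted, 4), ';', '.') AS DECIMAL(10, 2)) AS c1,
       TRY_CAST(PARSENAME(d.dotted, 3) AS INT)                               AS c2,
       TRY_CAST(PARSENAME(d.dotted, 2) AS INT)                               AS c3,
       TRY_CAST(REPLACE(PARSENAME(d.dotted, 1), ';', '.') AS DECIMAL(10, 2)) AS c4
INTO   #result
FROM   #src AS s
       CROSS APPLY STRING_SPLIT(s.string, '|') AS p
       CROSS APPLY (SELECT REPLACE(REPLACE(LTRIM(RTRIM(p.value)), '.', ';'), ',', '.') AS dotted) AS d
WHERE  LTRIM(RTRIM(p.value)) <> '';  -- skip the empty piece after the last '|'

SELECT * FROM #result;

DROP TABLE #src;
DROP TABLE #result;
```

On versions without STRING_SPLIT, the same SELECT ... INTO pattern can be wrapped around the XML-based query shown above.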

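This only works when each record has exactly 4 columns, because PARSENAME handles at most four dot-separated parts. In that case, you can do the following: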

SELECT REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 4), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 3), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 2), '~', '.'),
 REPLACE(PARSENAME(REPLACE(REPLACE(ColumnName, '.', '~'), ',', '.'), 1), '~', '.')
From #TEMP
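The four-column restriction comes from PARSENAME itself, which is designed for object names of the form server.database.schema.object. A short sketch using made-up input strings shows the behaviour:

```sql
-- Four parts: PARSENAME can address each one, counting from the right.
SELECT PARSENAME('a.b.c.d', 4) AS first_part,  -- a
       PARSENAME('a.b.c.d', 1) AS last_part;   -- d

-- Five parts: no longer a valid object name, so the result is NULL.
SELECT PARSENAME('a.b.c.d.e', 1) AS too_many_parts;  -- NULL
```

For records with more than four fields, a different splitting technique is needed.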

You can write a table-valued function to parse the string, like this:

CREATE FUNCTION dbo.parseData ( @stringToSplit VARCHAR(MAX) )
RETURNS
    @return TABLE (ID int, Column1 real, Column2 int, Column3 int, Column4 real)
AS
BEGIN
    DECLARE @char char;
    DECLARE @len int = LEN(@stringToSplit);    
    DECLARE @buffer varchar(50) = '';
    DECLARE @field int = 1;
    DECLARE @Column1 real
    DECLARE @Column2 int
    DECLARE @Column3 int
    DECLARE @Column4 real
    DECLARE @row int = 1
    DECLARE @i int = 1;
    WHILE @i <= @len BEGIN
        SELECT @char = SUBSTRING(@stringToSplit, @i, 1)
        IF @char = ','
        BEGIN
            IF @field = 1
                SET @Column1 = CONVERT(real, @buffer);
            ELSE IF @field = 2
                SET @Column2 = CONVERT(int, @buffer);
            ELSE IF @field = 3
                SET @Column3 = CONVERT(int, @buffer);    
            SET @buffer = '';
            SET @field = @field + 1
        END
        ELSE IF @char = '|'
        BEGIN
            SET @Column4 = CONVERT(real, @buffer);
            INSERT INTO @return (ID, Column1, Column2, Column3, Column4)
            VALUES (@row, @Column1, @Column2, @Column3, @Column4);
            SET @buffer = '';
            SET @row = @row + 1
            SET @field = 1
        END
        ELSE
        BEGIN
            SET @buffer = @buffer + @char
        END
        SET @i = @i + 1;
    END
    RETURN
END
GO

Then call the function like this:

SELECT Col1 = '88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
INTO #Temp1;
INSERT INTO #Temp1
VALUES ('88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|')
SELECT data.*
INTO #Temp2
FROM #Temp1 CROSS APPLY parseData(#Temp1.Col1) as data
SELECT *
FROM #Temp2
DROP TABLE #Temp1
DROP TABLE #Temp2

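Performance:

I performance-tested this technique against the one NoDisplayName described. Over 10,000 iterations, my technique took 13,826 ms and NoDisplayName's took 36,176 ms, so mine needs only 38% of the time that NoDisplayName's does.

To test this, I used an Azure database and ran the following script.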

-- First two queries to check the results are the same.
-- Note the Parsename technique returns strings rather than reals which is why
-- the last column has .00 at the end of the numbers in the Parsename technique.
DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.01|'
SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;
SELECT *
FROM dbo.parseData(@str)
GO
-- Now let's time the Parsename method over 10,000 iterations
SET NOCOUNT ON;
DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)
DECLARE @Start datetime = GETDATE();
while @i < 10000
begin
    INSERT INTO @table
    SELECT c1,c2,c3, c4
    FROM  (SELECT C1=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 4), ';', '.'),
                  C2=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 3), ';', '.'),
                  C3=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 2), ';', '.'),
                  C4=Replace(Parsename(Replace(Replace(Rtrim(Ltrim(Split.a.value('.', 'VARCHAR(100)'))), '.', ';'), ',', '.'), 1), ';', '.')
           FROM   (SELECT Cast ('<M>' + Replace(@str, '|', '</M><M>') + '</M>' AS XML) AS Data) AS A
                  CROSS APPLY Data.nodes ('/M') AS Split(a)) a
    WHERE  c1 IS NOT NULL;
    DELETE FROM @table;
    set @i = @i + 1;
end
DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO
-- Now time my technique over 10,000 iterations
SET NOCOUNT ON;
DECLARE @str VARCHAR(max)='88.47,1,263759,10.00| 303.53,2,264051,13.00| 147.92,3,264052,6.00| 43.26,4,268394,10.00| 127.7,5,269229,4.00|'
DECLARE @i int = 0
declare @table table (c1 decimal, c2 int, c3 int, c4 decimal)
DECLARE @Start datetime = GETDATE();
while @i < 10000
begin
    INSERT INTO @table
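    -- parseData also returns an ID column, so list the four value columns
    -- explicitly to match the four-column @table.
    SELECT Column1, Column2, Column3, Column4
    FROM dbo.parseData(@str)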
    DELETE FROM @table;
    set @i = @i + 1;
end
DECLARE @End datetime = GETDATE()
PRINT CONVERT(nvarchar(50),@Start,126) + ' - ' + convert(nvarchar(50),@End,126) + ' - ' + convert(nvarchar(50), DATEDIFF(ms, @start, @end))
GO