我想把一个字符串分成长度不超过30行的组,但是,在分割时,最后一个单词不应该中断,如果它中断,将该单词移动到下一个行集。
下面是示例字符串:
DECLARE @FullAddress VARCHAR(MAX)= '216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork'
所需输出:
+-----+------------------------------+
| Row | Result |
+-----+------------------------------+
| 1 | 216 Apartment123 AreaArea |
| 2 | SampleWord1 Word2 MiddleTown |
| 3 | Upper1Location Another5 |
| 4 | NewYork |
+-----+------------------------------+
我尝试使用substring,但它将单词分割为固定长度
DECLARE @FullAddress VARCHAR(MAX)= '216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown UPLocation Downtownocation NewYork'
SELECT SUBSTRING(@FullAddress,1,30)
UNION ALL
SELECT SUBSTRING(@FullAddress,31,30)
UNION ALL
SELECT SUBSTRING(@FullAddress,61,30)
假设您有一个地址表,我将使用递归CTE。
在每次迭代中,找到最后一个可能的空格,然后针对空格后的字符开始下一次迭代。
- 取31个字符
- 查找第一个空格 的位置
对于拐角情况要格外小心:
- 要搜索的剩余字符串小于30个字符
- 当前被搜索的字符串前31个字符中没有空格
使用以下测试数据…
CREATE TABLE test (
address VARCHAR(MAX)
);
INSERT INTO
test
VALUES
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork'),
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTownxx Upper1LocationUpper1LocationUpper1Location Another5 NewYork'),
('216 Apartment123 AreaArea SampleWord1 Word2 MiddleTownxx Upper1LocationUpper1LocationUpper1Location Another5 NewYork x')
;
使用以下CTE…
DECLARE @chars BIGINT = 30;
WITH
parts AS
(
SELECT
address,
LEN(address) AS length,
CAST(0 AS BIGINT) AS last_space,
CAST(1 AS BIGINT) AS next,
address AS fragment
FROM
test
UNION ALL
SELECT
parts.address,
parts.length,
last_space.pos,
parts.next + COALESCE(last_space.pos, @chars),
SUBSTRING(parts.address, parts.next, COALESCE(last_space.pos - 1, @chars))
FROM
parts
CROSS APPLY
(
SELECT
@chars + 2
-
NULLIF(
CHARINDEX(
' ',
REVERSE(
SUBSTRING(
parts.address + ' ',
parts.next,
@chars + 1
)
)
)
, 0
)
)
last_space(pos)
WHERE
parts.next <= parts.length
)
SELECT
*, len(fragment) AS chars
FROM
parts
WHERE
next > 1
ORDER BY
address,
next
https://dbfiddle.uk/?rdbms=sqlserver_2019&小提琴= acd11f2bc73e5036bd82498ecf14b08f
这是一种纯粹的SQL Server方法,但我建议使用SQL Server之外的技术,因为在将来维护代码时,您将拥有更有效的方法和更好的时间。不过,闲话少说,让我们来看一下,一步一步地解释每个概念:
<标题>第一步你可以创建一个临时表,它有一个id和一个内容字段。
<标题>第二步h1> 可以通过STRING_SPLIT拆分字符串,它将为您返回一个表。编辑:正如Larnu在评论部分指出的那样,不幸的是,SPLIT_STRING不能保证出现的顺序,参见string_split的结果顺序?
需要一个替代方案,如OPENJSON
。
- 创建临时表
- 拆分字符串
- 迭代拆分的结果
- 将最新记录的内容长度(最大id)加载到一个变量中,默认为0
- 如果你刚刚加载的值+当前字符串的长度+空格>最大长度
- 然后以您的值作为内容插入一个新记录
- else更新最后一条记录,通过将空格和当前字符串连接到它的前一个值
您可以使用递归CTE和xml路径来解决这个问题。为了更好的可读性,我将代码拆分为几个cte,但我猜您可能会将其精简为所需的cte。但是,结果应该符合您的预期结果:
DECLARE @FullAddress VARCHAR(MAX)= '216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork'
DECLARE @MaxLen INT = 30;
WITH cte AS(
SELECT 1 rn, CHARINDEX(' ', @FullAddress) spacePos, LEFT(@FullAddress, CHARINDEX(' ', @FullAddress)) SplitLeft, RIGHT(@FullAddress, LEN(@FullAddress) - CHARINDEX(' ', @FullAddress)) SplitRight
UNION ALL
SELECT rn+1, CHARINDEX(' ', SplitRight), LEFT(SplitRight, CHARINDEX(' ', SplitRight)), RIGHT(SplitRight, LEN(SplitRight) - CHARINDEX(' ', SplitRight))
FROM cte
WHERE CHARINDEX(' ', SplitRight) > 0
UNION ALL
SELECT rn+1, LEN(SplitRight), SplitRight, NULL
FROM cte
WHERE CHARINDEX(' ', SplitRight) = 0
),
cte2 AS(
SELECT *, SUM(spacePos) OVER (ORDER BY rn ROWS UNBOUNDED PRECEDING) s, SUM(spacePos) OVER (ORDER BY rn ROWS UNBOUNDED PRECEDING)/@MaxLen s1
FROM cte
),
cte3 AS(
SELECT s1 + (SUM(spacePos) OVER (PARTITION BY s1 ORDER BY rn ROWS UNBOUNDED PRECEDING)/@MaxLen) s, SplitLeft, rn
FROM cte2
)
SELECT DISTINCT s, (SELECT c2.SplitLeft AS [text()]
FROM cte3 c2
WHERE c2.s = c1.s
ORDER BY rn
FOR XML PATH ('')
) AS AdressLine
FROM cte3 c1
ORDER BY s
您可以使用OPENJOSN()
来维护排序,并从临时表中读取不同的值并插入到最终表中,如下所示:
declare @FullAddress VARCHAR(MAX)= '216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork'
drop table if exists #temp
select convert(int, [key]) [key], [value] strng into #temp
from openjson(concat('["', replace(@FullAddress, ' ', '","'), '"]'))
order by convert(int, [key])
declare @str varchar(100)='', @newstr varchar(30)='', @key int = 0
declare @final_table table (final_strings varchar(30))
while ((select count(1) from #temp) > 0)
begin
set @newstr = (select top 1 isnull(strng, '') from #temp order by [key])
if len(@str + ' ' + @newstr) > 30 and @str <> ''
begin
insert into @final_table select ltrim(@str)
set @str = @newstr
end
else
set @str = @str + ' ' + @newstr
delete #temp where [key] = @key
set @key = @key + 1
end
if @str <> ''
insert into @final_table select @str
select final_strings, len(final_strings) string_lenght from @final_table
请参阅数据库<>提琴这里.
这里我使用一个SPLIT函数(UDF)来分割字符串。您可以使用任何可用的拆分函数。
SPLIT函数:-
CREATE OR ALTER FUNCTION [dbo].[Split]
(
@RowData NVARCHAR(MAX),
@Delimeter NVARCHAR(MAX)
)
RETURNS @RtnValue TABLE
(
ID INT IDENTITY(1,1),
Data NVARCHAR(MAX)
)
AS
BEGIN
Declare @xml xml
SET @RowData=Replace(Replace(@RowData,'&','&'),'<','<')
SET @xml = N'<root><r>' + Replace(@RowData, @Delimeter,'</r><r>') + '</r></root>'
Insert into @RtnValue(Data)
SELECT t.value('.', 'nvarchar(max)') AS [Value] FROM @XML.nodes('//root/r') AS a(t)
RETURN
END
你的答案
DECLARE @FullAddress VARCHAR(MAX) = '216 Apartment123 AreaArea SampleWord1 Word2 MiddleTown Upper1Location Another5 NewYork Zakbrown'
SELECT id
,Data
FROM SPLIT(@FullAddress, ' ')
DECLARE @table TABLE (
id INT
,datas VARCHAR(50)
)
DECLARE @data VARCHAR(50)
DECLARE @ID INT
DECLARE @datas VARCHAR(50)
DECLARE @i INT = 1
DECLARE db_cursor CURSOR
FOR
SELECT id
,Data
FROM SPLIT(@FullAddress, ' ')
OPEN db_cursor
FETCH NEXT
FROM db_cursor
INTO @id
,@data
SET @datas = ''
WHILE @@FETCH_STATUS = 0
BEGIN
--select @data,Len(@datas)
IF Len(@datas + ' ' + @data) <= 30
BEGIN
SET @datas = @datas + ' ' + @data
END
ELSE
BEGIN
INSERT INTO @table (
id
,[datas]
)
VALUES (
@i
,@datas
)
SET @i = @i + 1
SET @datas = @data
END
FETCH NEXT
FROM db_cursor
INTO @id
,@data
END
CLOSE db_cursor
DEALLOCATE db_cursor
SELECT datas
FROM @table
UNION ALL
SELECT @datas
OUTPUT
216 Apartment123 AreaArea
SampleWord1 Word2 MiddleTown
Upper1Location Another5
NewYork Zakbrown
我创建了一个函数来做这个:
CREATE FUNCTION [dbo].[fnSplitSentence]
(
@fullText varchar(max),
@maxLength int
)
RETURNS @sentences table (
sentence varchar(max)
)
AS
BEGIN
declare @words table (
word varchar(max),
length int
)
SET @fullText = TRIM(@fullText) + ' '
declare @word varchar(max) = '',
@length int = 0,
@char varchar(1),
@i int = 1,
@len int = len(@fullText) + 1
while @i <= @len
begin
set @char = SUBSTRING(@fullText, @i, 1)
if @char = ' '
begin
insert into @words (word, length) values (@word, @length)
set @word = ''
set @length = 0
end
else
begin
set @word += @char
set @length += 1
end
set @i += 1
end
declare @sentence varchar(max) = ''
DECLARE CUR CURSOR FOR SELECT word, length FROM @words
open CUR
FETCH NEXT FROM CUR INTO @word, @length
WHILE @@FETCH_STATUS = 0
begin
--print(@word)
if LEN(@sentence) + LEN(@word) + 1 > @maxLength
begin
insert into @sentences (sentence) values (@sentence)
set @sentence = @word + ' '
end
else
set @sentence += @word + ' '
FETCH NEXT FROM CUR INTO @word, @length
end
insert into @sentences (sentence) values (@sentence)
CLOSE CUR
DEALLOCATE CUR
RETURN;
END
用法:
select * from fnSplitSentence('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore', 12)