How can I remove all unwanted whitespace from a string while keeping characters such as '\n'?

I have a string like this:

s = 'Hello   \nWorld!\nToday is a wonderful day'

I need to get this:

'Hello \nWorld!\nToday is a wonderful day'

I tried using split and join, like this:

' '.join('Hello   \nWorld!\nToday is a wonderful day'.split())

But I get this:

'Hello World! Today is a wonderful day'

A regular expression such as:

re.sub(r"\s+", " ", 'Hello   \nWorld!\nToday is a wonderful day')

gives the same result.
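Both attempts lose the newlines for the same reason: str.split() with no argument and the \s class each treat '\n' as ordinary whitespace. A minimal sketch that checks this on the string from the question (the variable names are only illustrative):

```python
import re

s = 'Hello   \nWorld!\nToday is a wonderful day'

# str.split() with no argument splits on any run of whitespace,
# and that includes newlines.
joined = ' '.join(s.split())

# \s also matches '\n', so the run '   \n' collapses into one space.
substituted = re.sub(r'\s+', ' ', s)

print(repr(joined))
print(repr(substituted))
```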

You can do a few things.

You can simply replace every run of one or more spaces with a single space:

re.sub(r'( )+', ' ', s)

To cover more kinds of (horizontal) whitespace, you can include the tab (\t) and form feed (\f) characters (see regex101):

re.sub(r'[\t\f ]+', ' ', s)

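Alternatively, instead of specifying the characters to replace, you can exclude the ones you do not want to replace (a double negative!):

re.sub(r'[^\S\n\r]+', ' ', s)

In the last example, ^ means that any character not in the list should match, \S means any non-whitespace character, and \n and \r are the newline and carriage return characters. See regex101.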
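To see how the three patterns differ, here is a small sketch that runs each of them on the string from the question and on a second string containing a tab. The second string and the helper name are assumptions made up for illustration.

```python
import re

PATTERNS = {
    'spaces only': r'( )+',
    'space, tab, form feed': r'[\t\f ]+',
    'any whitespace except \\n and \\r': r'[^\S\n\r]+',
}

def collapse_all(text):
    # Hypothetical helper: apply each pattern and collect the results.
    return {name: re.sub(pattern, ' ', text) for name, pattern in PATTERNS.items()}

s = 'Hello   \nWorld!\nToday is a wonderful day'
tabbed = 'a \t b\nc'  # made-up sample with a tab between two spaces

for sample in (s, tabbed):
    for name, result in collapse_all(sample).items():
        print(f'{name:35} {result!r}')
```

On the question's string all three give the same result. They only diverge once tabs or other whitespace characters appear, as in the second sample, where the spaces-only pattern leaves the tab in place.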

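Using str methods, you can get the desired output as follows:

s1 = 'Hello   \nWorld!\nToday is a wonderful day'
' '.join(i for i in s1.split(' ') if i)

which gives

'Hello \nWorld!\nToday is a wonderful day'

Explanation: split on the space character, use a generator expression to filter out the empty strings (which come from adjacent spaces), then join what remains.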
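The intermediate list makes the explanation easier to follow. A minimal sketch, with collapse_spaces as a hypothetical helper name:

```python
def collapse_spaces(text):
    # Hypothetical helper wrapping the split/filter/join idea.
    # split(' ') splits on single spaces only, so adjacent spaces
    # produce empty strings and '\n' stays inside its piece.
    parts = text.split(' ')
    return ' '.join(part for part in parts if part)

s1 = 'Hello   \nWorld!\nToday is a wonderful day'

print(s1.split(' '))
print(repr(collapse_spaces(s1)))
```

Note that leading and trailing spaces also produce empty strings, so this approach strips them as a side effect.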

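Here are two ways to do this for each of two interpretations of the question.

First interpretation: for whitespace characters other than a newline (\n), if the same whitespace character appears two or more times in a row, remove all but one of them.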

Replace each match of the regular expression

([ \t\r\f\v])\1*(?=\1)

with an empty string.

This regular expression has the following elements:

(               Begin capture group 1
  [ \t\r\f\v]   Match a whitespace character other than a newline (`\n`)
)               End capture group 1
\1*             Match the content of capture group 1 zero or more times
(?=\1)          Positive lookahead asserts that the next character matches
                the content of capture group 1
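In Python this could look like the sketch below. The function name and the second sample string are assumptions for illustration.

```python
import re

# A non-newline whitespace character, optionally repeated, that is
# still followed by one more copy of the same character.
DUPLICATES = re.compile(r'([ \t\r\f\v])\1*(?=\1)')

def drop_repeats(text):
    # Hypothetical helper: delete every match, leaving one character per run.
    return DUPLICATES.sub('', text)

s = 'Hello   \nWorld!\nToday is a wonderful day'
mixed = 'a  \t\t  b'  # made-up sample: two spaces, two tabs, two spaces

print(repr(drop_repeats(s)))
print(repr(drop_repeats(mixed)))
```

Because the backreference only matches the same character again, a space followed by a tab is not treated as a repeat, so the second sample keeps one space, one tab and one space.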

Alternatively, replace each match of

([ \t\r\f\v])\1+

with the content of capture group 1.

This regular expression has the following elements:

(              Begin capture group 1
  [ \t\r\f\v]  Match a whitespace character other than `\n`
)              End capture group 1
\1+            Match the content of capture group 1 one or more times
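A Python sketch of this variant, where the replacement string r'\1' puts back a single copy of the captured character (the function name is hypothetical):

```python
import re

# A non-newline whitespace character followed by one or more copies of itself.
RUNS = re.compile(r'([ \t\r\f\v])\1+')

def squeeze_repeats(text):
    # Hypothetical helper: replace each run with the captured character.
    return RUNS.sub(r'\1', text)

s = 'Hello   \nWorld!\nToday is a wonderful day'

print(repr(squeeze_repeats(s)))
print(repr(squeeze_repeats('a  \t\t  b')))
```

This version avoids the lookahead and may be easier to read, at the cost of needing a replacement string rather than an empty one.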

Second interpretation: for whitespace characters other than a newline (\n), if two or more of them appear in a row, remove all but the last one.

Replace each match of the regular expression

[ \t\r\f\v](?=[ \t\r\f\v])

with an empty string.

This regular expression has the following elements:

[ \t\r\f\v]     Match a whitespace character other than `\n`
(?=             Begin a positive lookahead
  [ \t\r\f\v]   Match a whitespace character other than `\n`
)               End positive lookahead
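A Python sketch of the second interpretation, assuming the pattern without a quantifier before the lookahead. The function name and the mixed sample are made up for illustration.

```python
import re

# A non-newline whitespace character that is immediately followed by
# another non-newline whitespace character.
FOLLOWED = re.compile(r'[ \t\r\f\v](?=[ \t\r\f\v])')

def keep_last(text):
    # Hypothetical helper: delete every whitespace character that has
    # another one right after it, so only the last of each run survives.
    return FOLLOWED.sub('', text)

s = 'Hello   \nWorld!\nToday is a wonderful day'

print(repr(keep_last(s)))
print(repr(keep_last('a \t\tb')))
```

Unlike the first interpretation, a run that mixes characters is reduced to a single character here, namely whichever one comes last in the run.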

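Alternatively, replace each match of

[ \t\r\f\v]{2,}

This regular expression reads, "match a whitespace character other than a newline (\n) two or more times, as many as possible".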
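The replacement for this last pattern is not stated above. One reading that fits the second interpretation, and it is only an assumption, is to replace each match with its own final character. A minimal sketch with a hypothetical function name:

```python
import re

# Two or more non-newline whitespace characters, as many as possible.
RUN = re.compile(r'[ \t\r\f\v]{2,}')

def keep_last_of_run(text):
    # Assumption: each run is replaced by its last character, since the
    # intended replacement is not given in the text.
    return RUN.sub(lambda m: m.group(0)[-1], text)

s = 'Hello   \nWorld!\nToday is a wonderful day'

print(repr(keep_last_of_run(s)))
print(repr(keep_last_of_run('a \t\tb')))
```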
