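I have strings like "hey what is up!", "what did you say?", "he said 'well'", etc., and a regex like this: [!%&'()$#"/\*+,-.:;<=>?@[]^_´`{|}~]. These are my delimiters, and a space should be inserted around them in the strings shown, like so: "hey what is up !", "what did you say ?", "he said ' well '". So if one of the delimiters comes right before another character sequence, add a space, and if it comes right after one, add a space as well.
How can I achieve this? I don't want to split on these delimiters.
Here is my solution, but I'm curious how to solve it with regex.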
space = set("[!%&'()$#\"/*+,-.:;<=>?@[]^_´`{|}~]")
for sent in self.sentences:
    sent = list(sent)
    for i, char in enumerate(sent):
        # Make sure to respect length of string when indexing
        if i != 0:
            # insert space in front if char is punctuation
            if sent[i] in space and sent[i - 1] != " ":
                sent.insert(i, " ")
        if i != len(sent) - 1:
            # insert space after if char is punctuation
            if sent[i] in space and sent[i + 1] != " ":
                sent.insert(i + 1, " ")
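You can extend your pattern to also match optional whitespace, then replace each match with the capture group plus a space before and after it (the loop is only for demonstration, it isn't required):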
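import re

strings = ["hey what is up!", "what did you say?", "he said 'well'"]
# Optional whitespace sits outside the group so it is replaced, not duplicated
pattern = r'\s?([!%&\'()$#"/\*+,\-.:;<=>?@\[\]^_´`{|}~])\s?'
for string in strings:
    # Surround the captured delimiter with single spaces, then trim the ends
    print(re.sub(pattern, r' \1 ', string).strip())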
Output:
hey what is up !
what did you say ?
he said ' well '
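If you would rather not touch places where a space already exists, zero-width lookarounds are another option. This is only a sketch, not the method from the answer above: the names `PUNCT`, `GAP` and `space_out` are made up for illustration, the delimiter set is assumed to be the one from the question, and it assumes Python 3.7 or later. `re.escape` builds the character class so nothing has to be escaped by hand.

```python
import re

# Assumed delimiter set, copied from the question (hypothetical constant name)
PUNCT = "!%&'()$#\"/*+,-.:;<=>?@[]^_´`{|}~"
CLS = "[" + re.escape(PUNCT) + "]"

# Match the empty position before a delimiter that follows a non-space,
# or the empty position after a delimiter that precedes a non-space.
GAP = re.compile(r"(?<=\S)(?=" + CLS + r")|(?<=" + CLS + r")(?=\S)")

def space_out(text):
    """Insert a single space at every matched gap."""
    return GAP.sub(" ", text)

for s in ["hey what is up!", "what did you say?", "he said 'well'"]:
    print(space_out(s))
```

Because only empty positions are matched, existing spaces are left alone and nothing needs to be stripped afterwards.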
Without the help of the re module, you could simply do this:
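punctuation = "!%&'()$#\"/*+,-.:;<=>?@[]^_´`{|}~"
mystring = "Well hello! How are you?"
mylist = list(mystring)
i = 0  # index into mylist, which grows as spaces are inserted
for c in mystring:
    if c in punctuation:
        # insert a space in front of the punctuation character
        mylist.insert(i, ' ')
        i += 2
    else:
        i += 1
print(''.join(mylist))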
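The loop above only puts a space in front of each punctuation character. A variant that pads both sides, and skips places that already have a space, could look roughly like this. It is a sketch under assumptions: `PUNCTUATION` and `pad_punctuation` are hypothetical names, the delimiter set is taken from the question, and only the plain space character counts as whitespace.

```python
# Assumed delimiter set from the question (hypothetical constant name)
PUNCTUATION = set("!%&'()$#\"/*+,-.:;<=>?@[]^_´`{|}~")

def pad_punctuation(text):
    """Return text with a space between each delimiter and its neighbours."""
    out = []
    for ch in text:
        # Treat the start of the string like a space so nothing is prepended
        prev = out[-1] if out else " "
        # A gap is needed when two non-space characters touch
        # and at least one of them is a delimiter
        if ch != " " and prev != " " and (ch in PUNCTUATION or prev in PUNCTUATION):
            out.append(" ")
        out.append(ch)
    return "".join(out)

print(pad_punctuation("Well hello! How are you?"))
print(pad_punctuation("he said 'well'"))
```

Building a new list instead of inserting into an existing one avoids having to track a shifting index.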
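You can write a loop that walks through the string and, whenever it finds a punctuation character, uses slicing to cut the string in two and joins the halves with a space in between. For example: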
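yourString = "hey what is up!"
newString = yourString
for i, char in enumerate(yourString):
    if char == '!':
        # slice at the "!" and rejoin the halves with a space in between
        newString = yourString[:i] + " " + yourString[i:]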
It only checks for "!", but you can replace that check with a lookup in a dictionary of punctuation characters.
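To make the slicing idea handle every delimiter and both sides, one possible sketch is below. The names `PUNCTUATION` and `insert_spaces_by_slicing` are hypothetical, a set is used here in place of a dictionary, and the delimiter set is assumed to be the one from the question. Walking from right to left is a design choice so that the indices still to be visited are not shifted by earlier insertions.

```python
# Assumed delimiter set from the question (hypothetical constant name)
PUNCTUATION = set("!%&'()$#\"/*+,-.:;<=>?@[]^_´`{|}~")

def insert_spaces_by_slicing(text):
    """Pad each delimiter with spaces using only string slicing."""
    # Go right to left: insertions only move characters already handled
    for i in range(len(text) - 1, -1, -1):
        if text[i] in PUNCTUATION:
            # Space after the delimiter, unless at the end or already there
            if i + 1 < len(text) and text[i + 1] != " ":
                text = text[:i + 1] + " " + text[i + 1:]
            # Space before the delimiter, unless at the start or already there
            if i > 0 and text[i - 1] != " ":
                text = text[:i] + " " + text[i:]
    return text

print(insert_spaces_by_slicing("hey what is up!"))
print(insert_spaces_by_slicing("he said 'well'"))
```

Each slice builds a new string, so for long inputs the list-based or regex approaches are likely to be faster.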