Get as close as possible to n characters, where the last character must be a space or the end/start of a line



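I want to extract roughly 50 characters to the left and right of a certain word, but to make sure the outermost words are not split, the last character on each side must be a space, the start of a line, or the end of a line. I tried something like this, without success: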

^.*(\s{0,50}(word)\s{0,50}).*$

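This matches "word", but the match stops abruptly right before and after it.

For example, with "...test test word test test...", it matches only "word".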

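By using \s{0,50}, you are effectively trying to match 0 to 50 whitespace characters. You probably want to change \s to the characters you actually need (for example [a-zA-Z\s.], or . to match any character).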
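To make the difference concrete, here is a minimal sketch in Python. The choice of Python and the sample string are assumptions for illustration only; the question does not name a regex flavor.

```python
import re

# Made-up sample text for illustration.
text = "test test word test test"

# Pattern from the question: \s matches whitespace only, so the capture
# cannot grow past the spaces directly around "word".
question_match = re.search(r"^.*(\s{0,50}(word)\s{0,50}).*$", text)
narrow = question_match.group(1)

# Same idea with "." (any character except a newline). The leading ^.* is
# dropped because, being greedy, it would consume the left-hand context.
fixed_match = re.search(r".{0,50}word.{0,50}", text)
wide = fixed_match.group(0)

print(repr(narrow))  # "word" plus adjacent whitespace only
print(repr(wide))    # the whole sample, since it is shorter than the limits
```

Swapping \s for . fixes the empty context, but it does nothing yet about cutting words in half at the outer edges.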

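My suggestion is as follows:

((\b.{0,50})?(word)(.{0,50}\b)?)

Note that I had to create two new groups and make them optional so that the boundaries can be matched. You may also want to add \b inside the groups to separate your word from its neighbors, like so:

((\b.{0,50}\b)?(word)(\b.{0,50}\b)?)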
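As a rough illustration of how the outer \b anchors behave, the sketch below wraps the first suggested pattern in a small Python function. The function name context_window, the parameter n, and the sample sentence are hypothetical and not part of the answer; the width is shrunk to 8 so the trimming is visible on a short string.

```python
import re
from typing import Optional


def context_window(text: str, word: str, n: int = 50) -> Optional[str]:
    """Return `word` with up to n characters of context on each side.

    Hypothetical helper: it fills the suggested pattern in with a
    configurable width and an escaped search term.
    """
    pattern = r"((\b.{0,%d})?(%s)(.{0,%d}\b)?)" % (n, re.escape(word), n)
    match = re.search(pattern, text)
    return match.group(1) if match else None


sample = "alpha beta word gamma delta"
snippet = context_window(sample, "word", n=8)

# The outer \b forces both ends onto word boundaries, so "alpha" and
# "delta" are dropped entirely instead of being cut in half.
print(repr(snippet))  # ' beta word gamma '
```

The result can still start or end with a space, because a word boundary also sits between a space and a letter; calling .strip() on it is one simple way to tidy that up.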

You can use this regex to extract up to 50 characters to the left and right of a given word:

(.{0,50}\bword\b.{0,50})

Online demo: http://regex101.com/r/uV8pL6

Explanation:

1st Capturing group (.{0,50}\bword\b.{0,50})
.{0,50} matches any character (except newline)
Quantifier: Between 0 and 50 times, as many times as possible, giving back as needed [greedy]
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
word matches the characters word literally (case sensitive)
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
.{0,50} matches any character (except newline)
Quantifier: Between 0 and 50 times, as many times as possible, giving back as needed [greedy]
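The explanation above can be checked with a short Python sketch. The sample strings and the narrowed width of 8 are assumptions chosen to keep the output short; the pattern itself is the one given in this answer.

```python
import re

# Pattern from this answer, with the backslashes written out.
pattern = re.compile(r"(.{0,50}\bword\b.{0,50})")

# \bword\b only accepts "word" as a whole word, so "password" is skipped.
inside_other_word = pattern.search("a password here")
whole_word = pattern.search("one word here")

# Same pattern with the width reduced to 8 to show where the window ends.
narrow = re.compile(r"(.{0,8}\bword\b.{0,8})")
window = narrow.search("alpha beta word gamma delta").group(1)

print(inside_other_word)    # None
print(whole_word.group(1))  # one word here
print(repr(window))         # 'ha beta word gamma d'
```

In this form the \b anchors only guard the search term itself. The outer edges of the window are set purely by the character count, so they can still land in the middle of a neighboring word, as the last line of output shows.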
