Text:
One sentence here, much wow. Another one here. This is O.N.E. example n. 1, a nice one to understand. Hope it's clear now!
Regex: (?<=\.\s)[A-Z].+?nice one.+?\.(?=\s[A-Z])
Result: Another one here. This is O.N.E. example n. 1, a nice one to understand.
How can I get "This is O.N.E. example n. 1, a nice one to understand." instead (i.e. the shortest possible sentence that matches the regex)?
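Just insert a greedy .* in front of the expression and wrap the sentence in a capture group:
.*\.\s([A-Z].+?nice one.+?\.(?=\s[A-Z]))
The greedy .* first consumes as much as it can and then backtracks, so group 1 starts at the last sentence start from which the rest of the pattern can still match.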
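To see the difference between the two patterns, here is a minimal Python sketch. The variable names (text, original, with_prefix) are only illustrative, and it assumes the text contains no newlines, since . does not match them by default.

```python
import re

text = ("One sentence here, much wow. Another one here. "
        "This is O.N.E. example n. 1, a nice one to understand. "
        "Hope it's clear now!")

# Original pattern: the leftmost possible start wins, so the match
# begins at "Another" and spans two sentences.
original = re.compile(r"(?<=\.\s)[A-Z].+?nice one.+?\.(?=\s[A-Z])")

# Greedy prefix: .* runs to the end of the string and backtracks, so
# group 1 begins at the last sentence start that still allows a match.
with_prefix = re.compile(r".*\.\s([A-Z].+?nice one.+?\.(?=\s[A-Z]))")

too_long = original.search(text).group(0)
shortest = with_prefix.search(text).group(1)

print(too_long)
print(shortest)
```

Note that the prefixed pattern requires a dot and a whitespace character before the sentence, so it would presumably not find a target sentence that is the very first one in the text.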
Here is a different approach: just split the whole text into sentences, then filter out the one you want:
import re
s = "One sentence here, much wow. Another one here. This is O.N.E. example n. 1, a nice one to understand. Hope it's clear now!"
result = [x for x in re.split(r'(?<=\B.\.)\s*', s) if 'nice one' in x][0]
print(result) # This is O.N.E. example n. 1, a nice one to understand.
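I don't know how many edge cases you have, but here I used re.split() with the following pattern: (?<=\B.\.)\s*
This means:
(?<=\B.\.) - Positive lookbehind asserting that the position is preceded by a position where \b (word boundary) does not apply, then any single character, then a literal dot.
\s* - Zero or more whitespace characters.
In effect, a dot only ends a sentence when the character before it is not the first character of a word, so the dots in O.N.E. and n. are skipped.
With the resulting array, it shouldn't be much of a problem to check which element contains the phrase you want, "nice one".
See the online demo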
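As a rough illustration of what the split produces, the sketch below prints every piece before filtering. The names parts and first are hypothetical, and next(..., None) is swapped in for the [0] index so that a text without the phrase yields None rather than an IndexError.

```python
import re

s = ("One sentence here, much wow. Another one here. "
     "This is O.N.E. example n. 1, a nice one to understand. "
     "Hope it's clear now!")

# Split after a dot whose preceding character is not at the start of a
# word; the whitespace that follows the dot is consumed by the split.
parts = re.split(r"(?<=\B.\.)\s*", s)
for part in parts:
    print(repr(part))

# Take the first piece containing the phrase, or None if there is none.
first = next((part for part in parts if "nice one" in part), None)
print(first)
```

For this input the list has four elements, one per sentence, and the abbreviation dots stay inside the third one.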
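You could exclude matching a dot in general, and only allow a dot that follows an uppercase character, or a dot that is followed by a whitespace character and a digit.
(?:(?<=\.\s)|^)[A-Z][^.A-Z]*(?:(?:[A-Z]\.|\.\s\d)[^.A-Z]*)*\bnice one\b.+?(?=\s[A-Z])
(?:(?<=\.\s)|^) - Assert a . and a whitespace character directly to the left, or the start of the string
[A-Z][^.A-Z]* - Match an uppercase character A-Z, then 0+ times any character except a dot or an uppercase character
(?: - Non-capturing group
(?:[A-Z]\.|\.\s\d) - Match either A-Z and a ., or a ., a whitespace character and a digit
[^.A-Z]* - Optionally match any character except . or an uppercase character
)* - Close the group and optionally repeat it
\bnice one\b - Match nice one between word boundaries
.+?(?=\s[A-Z]) - Match as few characters as possible until a whitespace character and an uppercase character can be asserted to the right
Regex demo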
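The same pattern can be tried in Python. This is only a sketch: it rewrites the expression with re.VERBOSE so each part can carry a comment (the space in "nice one" must then be escaped), and the names text, pattern and m are illustrative.

```python
import re

text = ("One sentence here, much wow. Another one here. "
        "This is O.N.E. example n. 1, a nice one to understand. "
        "Hope it's clear now!")

pattern = re.compile(
    r"""
    (?:(?<=\.\s)|^)          # sentence start: after ". " or at string start
    [A-Z][^.A-Z]*            # uppercase letter, then no dots or uppercase
    (?:
        (?:[A-Z]\.|\.\s\d)   # an allowed dot: "O." or ". 1"
        [^.A-Z]*             # then again no dots or uppercase
    )*
    \bnice\ one\b            # the phrase (space escaped for re.VERBOSE)
    .+?                      # as few characters as possible, up to ...
    (?=\s[A-Z])              # ... whitespace followed by an uppercase letter
    """,
    re.VERBOSE,
)

m = pattern.search(text)
print(m.group(0) if m else None)
```

One trade-off of this design: because [^.A-Z]* excludes uppercase letters, a capitalized word in the middle of the sentence (for example a proper noun) that is not followed by a dot would prevent the match.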