So I'm trying to do something like this (yes, including the newlines):
Match #1
START
START
stuff
STOP
more stuff
STOP
Match #2
START
START
stuff
STOP
more stuff
STOP
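I've gotten this far:
START(.*?^(?:(?!STOP).)*$|(?R))|STOP
with the flags g, m, i and s.
The problem is that I can't match anything after a STOP without the match running all the way to the very last "STOP" in the whole text.
Here's a regex101 example:
https://regex101.com/r/vD4nX6/1
I'd appreciate some guidance.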
Thanks in advance
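Here's a pattern that matches your example:
^\h*START\h*\n(?:\h*+(?!(?:START|STOP)\h*$)[^\n]*\n|(?R)\n)*\h*STOP\h*$
Use it with the /mg flags (live at https://regex101.com/r/iK9tK5/1).
The idea behind it (the spaces and comments are only for readability; written this way the pattern would also need the x flag):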
^                               # beginning of line
\h* START \h* \n                # "START" optionally surrounded by horizontal whitespace
                                # on a line of its own
(?:                             # between START/STOP, every line is either "normal"
                                # or a recursive START/STOP block
    \h*+                        # a normal line starts with optional horizontal whitespace
    (?!                         # ... not followed by ...
        (?: START | STOP ) \h* $    # "START" or "STOP" on their own
    )
    [^\n]* \n                   # any characters, then a newline
|
    (?R) \n                     # otherwise it's a recursive START/STOP block
)*                              # we can have as many items as we want between START/STOP
\h* STOP \h*                    # "STOP" optionally surrounded by horizontal whitespace
$                               # end of line
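If you want to sanity-check the logic outside a PCRE-style engine (Python's built-in re module has no (?R)), the breakdown above can be mirrored by a small recursive line scanner. This is only a sketch of the same idea, not the regex itself: the names is_marker, parse_block and find_blocks are made up for illustration, and it assumes "\n" line endings and treats only spaces and tabs as horizontal whitespace.

```python
def is_marker(line, word):
    # Mirrors \h*WORD\h*: the word alone on its line, padded only by spaces/tabs.
    return line.strip(" \t") == word


def parse_block(lines, i):
    """Return the index of the STOP line that closes the block starting at
    lines[i], or None if lines[i] does not start a complete block."""
    if i >= len(lines) or not is_marker(lines[i], "START"):
        return None
    i += 1
    while i < len(lines):
        if is_marker(lines[i], "STOP"):
            return i                      # closing STOP of this block
        if is_marker(lines[i], "START"):
            end = parse_block(lines, i)   # plays the role of (?R)
            if end is None:
                return None               # nested block never closes
            i = end + 1
        else:
            i += 1                        # a "normal" line
    return None                           # ran out of lines before STOP


def find_blocks(text):
    """Collect every outermost START/STOP block, like the /g flag would."""
    lines = text.split("\n")
    blocks = []
    i = 0
    while i < len(lines):
        end = parse_block(lines, i)
        if end is None:
            i += 1                        # no block here, try the next line
        else:
            blocks.append("\n".join(lines[i:end + 1]))
            i = end + 1                   # continue after the matched block
    return blocks
```

Calling find_blocks on the text from the question should give two blocks, each running from the outer START to its own outer STOP rather than to the last STOP in the text.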
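I've made \h*+ possessive to avoid accidentally matching a line such as " STOP" as a normal line. With a plain \h*, the engine could backtrack to 0 iterations; at that position the text ahead is not "STOP" but " STOP" (with the space), so the negative lookahead would pass. The extra + forces \h to match as much as possible and never give anything back, so it must consume the space.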
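Alternatively, you could pull the \h* into the lookahead: (?!\h*(?:START|STOP)\h*$)
That would work too, but then the lookahead skips over any whitespace to see whether it is followed by START/STOP, only for the [^\n]* outside to go over those same spaces again. With \h*+ at the start, we match those spaces once, with no backtracking. I guess that's a micro-optimization.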
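The difference between the three variants can be checked on single lines. The sketch below uses Python's built-in re under a few assumptions: [ \t] stands in for \h (which re lacks), and the possessive \h*+ is imitated with the lookahead-plus-backreference idiom (?=([ \t]*))\1 so that it also runs on Python versions older than 3.11, where possessive quantifiers are not available. The names GREEDY, POSSESSIVE, LOOKAHEAD and is_normal_line are made up for illustration.

```python
import re

# "START" or "STOP" followed only by horizontal whitespace up to end of line.
MARKER = r"(?:START|STOP)[ \t]*$"

# Plain greedy [ \t]*: it may backtrack to zero iterations.
GREEDY = re.compile(r"^[ \t]*(?!" + MARKER + r")[^\n]*$")

# Stand-in for the possessive \h*+: the lookahead captures the whole run of
# whitespace and \1 re-matches it, so nothing can be given back afterwards.
POSSESSIVE = re.compile(r"^(?=([ \t]*))\1(?!" + MARKER + r")[^\n]*$")

# Whitespace moved inside the negative lookahead instead.
LOOKAHEAD = re.compile(r"^(?![ \t]*" + MARKER + r")[^\n]*$")


def is_normal_line(pattern, line):
    """True if the pattern accepts the line as a 'normal' (non-marker) line."""
    return pattern.match(line) is not None


for line in ["stuff", "  more stuff", "STOP", " STOP", "\tSTART  "]:
    results = [is_normal_line(p, line) for p in (GREEDY, POSSESSIVE, LOOKAHEAD)]
    print(repr(line), results)
```

Only the plain greedy version accepts " STOP" and "\tSTART  " as normal lines; the other two reject them, and all three agree on the remaining inputs.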