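I am trying to find the nth match, or the last match if there are fewer than n. n is determined in my program, and the regex string is constructed by substituting an integer for "n".
Here is my best guess, but my repetition operator {1,n} always matches only once. I thought it would be greedy by default.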
The basic regex would be:
distinctiveString[\s\S]*?value="([^"]*)"
So I modified it to this to try to get the nth one instead:
(?:distinctiveString[\s\S]*?){1,n}value="([^"]*)"
distinctiveString randomStuff value="val1"
moreRandomStuff
distinctiveString randomStuff value="val2"
moreRandomStuff
distinctiveString randomStuff value="val3"
moreRandomStuff
distinctiveString randomStuff value="val4"
moreRandomStuff
distinctiveString randomStuff value="val5"
So in this case, what I want is "val2" when n=2, "val5" when n=5, and also "val5" when n=8.
I am passing my regex through an application layer, but I think it is handed to Perl as-is.
Try something like this:
(?:(?:[\s\S]*?distinctiveString){4}[\s\S]*?|(?:[\s\S]*distinctiveString)[\s\S]*?)value="([^"]*)"
which will have "val4" in match group 1 for your input, or "val3" for the input:
distinctiveString randomStuff value="val1"
moreRandomStuff
distinctiveString randomStuff value="val2"
moreRandomStuff
distinctiveString randomStuff value="val3"
A quick breakdown of the pattern:
(?:                                         #
  (?:[\s\S]*?distinctiveString){4}[\s\S]*?  # match 4 'distinctiveString's
  |                                         # OR
  (?:[\s\S]*distinctiveString)[\s\S]*?      # match the last 'distinctiveString'
)                                           #
value="([^"]*)"                             #
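As a quick sanity check of the two branches, here is a minimal sketch that runs the pattern (with n hard-coded to 4) against a 5-entry and a 3-entry input. The class name FixedNthDemo and the helpers firstValue and sample are made up for this illustration; only the regex itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedNthDemo {
    // The pattern from the breakdown above, with n fixed at 4.
    public static final Pattern P = Pattern.compile(
        "(?:(?:[\\s\\S]*?distinctiveString){4}[\\s\\S]*?"
        + "|(?:[\\s\\S]*distinctiveString)[\\s\\S]*?)value=\"([^\"]*)\"");

    // Returns the captured value of the first match, or null if nothing matches.
    public static String firstValue(String text) {
        Matcher m = P.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Hypothetical helper: builds sample input with `count` entries, val1..valN.
    public static String sample(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= count; i++) {
            sb.append("distinctiveString randomStuff value=\"val").append(i).append("\"\n");
            if (i < count) {
                sb.append("moreRandomStuff\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstValue(sample(5))); // val4 (first branch: 4th occurrence)
        System.out.println(firstValue(sample(3))); // val3 (second branch: last occurrence)
    }
}
```

With five entries the first alternative succeeds; with only three it cannot find four occurrences, so the regex falls through to the greedy second alternative and lands on the last one.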
Judging by your profile, you seem to be most active in the Java tag, so here is a small Java demo:
import java.util.regex.*;

public class Main {

    private static String getNthMatch(int n, String text, String distinctive) {
        String regex = String.format(
                "(?xs)                    # enable comments and dot-all             \n" +
                "(?:                      # start non-capturing group 1             \n" +
                "  (?:.*?%s){%d}          #   match n 'distinctive' strings         \n" +
                "  |                      #   OR                                    \n" +
                "  (?:.*%s)               #   match the last 'distinctive' string   \n" +
                ")                        # end non-capturing group 1               \n" +
                ".*?value=\"([^\"]*)\"    # match the value                         \n",
                distinctive, n, distinctive
        );
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        String text = "distinctiveString randomStuff value=\"val1\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val2\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val3\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val4\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val5\" ";
        String distinctive = "distinctiveString";
        System.out.println(getNthMatch(4, text, distinctive));
        System.out.println(getNthMatch(5, text, distinctive));
        System.out.println(getNthMatch(6, text, distinctive));
        System.out.println(getNthMatch(7, text, distinctive));
    }
}
which will print the following to the console:
val4
val5
val5
val5
Note that when the dot-all option ((?s)) is enabled, . matches the same characters as [\s\S].
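To illustrate the dot-all remark, here is a small sketch comparing the three spellings on a two-line string. The class name DotAllDemo and the helper matchesAll are invented for this example; the behavior shown is that of java.util.regex, where . does not match line terminators unless dot-all is on.

```java
import java.util.regex.Pattern;

public class DotAllDemo {
    // True if the whole text matches the regex.
    public static boolean matchesAll(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String twoLines = "a\nb";
        System.out.println(matchesAll("a.b", twoLines));        // false: '.' skips line breaks by default
        System.out.println(matchesAll("(?s)a.b", twoLines));    // true: dot-all lets '.' match '\n'
        System.out.println(matchesAll("a[\\s\\S]b", twoLines)); // true: [\s\S] matches any character
    }
}
```

So the (?s) form and the [\s\S] form are interchangeable here; the latter is handy when you cannot set flags through the layer that passes the regex along.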
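EDIT
Yes, {1,n} is greedy. However, when you place [\s\S]*? after distinctiveString, as in (?:distinctiveString[\s\S]*?){1,3}, then distinctiveString is matched, followed reluctantly by zero or more characters (so zero characters are matched at first), and that is then repeated 1 to 3 times. A second repetition can only start where the next distinctiveString begins, but the reluctant [\s\S]*? stops growing as soon as value=" can match, which happens at the first value, so the group is only ever matched once. What you want to do is move the [\s\S]*? in front of distinctiveString: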
import java.util.regex.*;

public class Main {

    private static String getNthMatch(int n, String text, String distinctive) {
        // Backslashes and quotes must be escaped inside the Java string literal.
        String regex = String.format(
                "(?:[\\s\\S]*?%s){1,%d}[\\s\\S]*?value=\"([^\"]*)\"",
                distinctive, n
        );
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        String text = "distinctiveString randomStuff value=\"val1\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val2\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val3\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val4\" \n" +
                "moreRandomStuff \n" +
                "distinctiveString randomStuff value=\"val5\" ";
        String distinctive = "distinctiveString";
        System.out.println(getNthMatch(4, text, distinctive));
        System.out.println(getNthMatch(5, text, distinctive));
        System.out.println(getNthMatch(6, text, distinctive));
        System.out.println(getNthMatch(7, text, distinctive));
    }
}
which also prints:
val4
val5
val5
val5
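To make the difference from the edit concrete, here is a rough side-by-side sketch that runs both placements of [\s\S]*? with {1,3} over a three-entry input. The names QuantifierPlacementDemo, AFTER, BEFORE, TEXT and firstGroup are hypothetical and exist only for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierPlacementDemo {
    // Pattern from the question: the reluctant part comes AFTER distinctiveString.
    public static final String AFTER =
        "(?:distinctiveString[\\s\\S]*?){1,3}value=\"([^\"]*)\"";

    // Suggested fix: the reluctant part comes BEFORE distinctiveString.
    public static final String BEFORE =
        "(?:[\\s\\S]*?distinctiveString){1,3}[\\s\\S]*?value=\"([^\"]*)\"";

    public static final String TEXT =
        "distinctiveString randomStuff value=\"val1\"\n"
        + "moreRandomStuff\n"
        + "distinctiveString randomStuff value=\"val2\"\n"
        + "moreRandomStuff\n"
        + "distinctiveString randomStuff value=\"val3\"\n";

    // Returns group 1 of the first match, or null if there is no match.
    public static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstGroup(AFTER, TEXT));  // val1: the group repeats only once
        System.out.println(firstGroup(BEFORE, TEXT)); // val3: the group repeats three times
    }
}
```

In the first pattern the greedy {1,3} never gets a chance at a second repetition before value=" matches; in the second, each repetition reluctantly skips ahead to the next distinctiveString, so the greedy quantifier can take as many repetitions as the input allows.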