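The quantifier a? should match a single or no occurrence of a. The given program uses the java.util.regex package to match a regular expression against a string.
My question is about the output of the program, i.e. the result of the pattern matching:
Output of the program:-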
Enter your regex: a?
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.
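Question:-
It is supposed to match either one or zero occurrences of a. So shouldn't it match a zero-length "" (i.e. the absence/zero occurrence of a) starting and ending at index 0, then match a starting at index 0 and ending at index 1, and then a "" starting and ending at index 1? I think it should.
As it is, the matcher seems to keep looking for a's throughout the string, and only when it is certain that there are no more a's (that is, at the end of the string?) does it look for the zero occurrence/absence of a. I think that would be tedious, and I doubt that is what actually happens. But then, shouldn't it find a "" starting and ending at index 0 before it matches the a starting at index 0 and ending at index 1?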
Program:-
import java.io.InputStreamReader;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/*
 * Enter your regex: foo
 * Enter input string to search: foo
 * I found the text foo starting at index 0 and ending at index 3.
 */
public class RegexTestHarness {

    public static void main(String[] args) {
        /*Console console = System.console();
        if (console == null) {
            System.err.println("No console.");
            System.exit(1);
        }*/
        // One Scanner for the whole session, so buffered input is not lost
        // between iterations; next() reads one whitespace-delimited token.
        Scanner scanner = new Scanner(new InputStreamReader(System.in));
        while (true) {
            /*Pattern pattern =
            Pattern.compile(console.readLine("%nEnter your regex: ", null));*/
            System.out.print("\nEnter your regex: ");
            Pattern pattern = Pattern.compile(scanner.next());
            System.out.print("\nEnter input string to search: ");
            Matcher matcher = pattern.matcher(scanner.next());
            System.out.println();
            boolean found = false;
            // find() reports successive non-overlapping matches, left to right.
            while (matcher.find()) {
                /*console.format("I found the text" +
                    " \"%s\" starting at " +
                    "index %d and ending at index %d.%n",
                    matcher.group(),
                    matcher.start(),
                    matcher.end());*/
                System.out.println("I found the text \"" + matcher.group()
                        + "\" starting at index " + matcher.start()
                        + " and ending at index " + matcher.end() + ".");
                found = true;
            }
            if (!found) {
                //console.format("No match found.%n", null);
                System.out.println("No match found.");
            }
        }
    }
}
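
The ? quantifier is greedy, which means it will try to find the largest possible match. At index 0 an a is available, so the greedy match consumes it. Since the matched part cannot be reused, you cannot match the empty string "" before the a (you can think of the a as the first, greedy match), but you can match the empty string after it.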
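A minimal sketch of the greedy behavior described above. The class name GreedyDemo and the helper findAll are invented for this illustration and are not part of the original program; the helper just collects what find() reports. The second input, "ba", is an extra case added to show that a zero-length match does appear at index 0 when no a starts there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {

    // Hypothetical helper: records each match as "text"@start-end.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add("\"" + matcher.group() + "\"@"
                    + matcher.start() + "-" + matcher.end());
        }
        return matches;
    }

    public static void main(String[] args) {
        // An 'a' is available at index 0, so greedy a? takes it; the empty
        // match is only reported at index 1, where nothing is left.
        System.out.println(findAll("a?", "a"));   // ["a"@0-1, ""@1-1]

        // No 'a' starts at index 0 here, so a? matches the empty string there.
        System.out.println(findAll("a?", "ba"));  // [""@0-0, "a"@1-2, ""@2-2]
    }
}
```

So the matcher does not scan the whole string for a's first; at each position it simply prefers one a over zero, and falls back to the empty match only where no a is present.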
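You can make the quantifier reluctant by adding a ? after it, which makes it try to find the smallest possible match. So, if you look for matches of the regex a??, you will see 0 as the index of the first match (the empty string before the a).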
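
A minimal sketch of the reluctant variant, under the same assumptions as the test harness (successive calls to find() on the input "a"). ReluctantDemo and findAll are names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReluctantDemo {

    // Hypothetical helper: records each match as "text"@start-end.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add("\"" + matcher.group() + "\"@"
                    + matcher.start() + "-" + matcher.end());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Reluctant a?? prefers zero occurrences, so the first match is the
        // empty string at index 0. After an empty match, find() resumes one
        // position further on, so the next match is the empty string at index 1.
        System.out.println(findAll("a??", "a"));  // [""@0-0, ""@1-1]

        // matches() must cover the whole input, which forces a?? to take the 'a'.
        System.out.println(Pattern.matches("a??", "a"));  // true
    }
}
```

Note that with a?? the a itself is never reported by find() on this input: the reluctant quantifier is satisfied by the empty string at every position it is tried.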