In Cygwin, this does not return a match:
$ echo "aaab" | grep '^[ab]+$'
But this does return a match:
$ echo "aaab" | grep '^[ab][ab]*$'
aaab
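Aren't the two expressions identical? Is there any way to express "one or more characters of the character class" without typing the character class twice (as in the second example)?
According to this link, the two expressions should be the same, but perhaps Regular-Expressions.info does not cover bash in Cygwin.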
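grep has several matching "modes", and by default it uses only the basic set, which does not recognize many metacharacters unless they are escaped. You can put grep into extended mode or Perl mode so that + is treated as a quantifier.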
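As a rough illustration of the difference between the matchers, the sketch below feeds the same input through basic and extended mode. The `check` helper is a made-up name for this example, not part of grep. The basic-mode interval `\{1,\}` is the POSIX way to say "one or more" without repeating the character class; `-P` is left out because it is only available when grep was built with PCRE support.

```shell
# Hypothetical helper: reports whether "aaab" matches the given grep arguments.
check() {
    if printf '%s\n' "aaab" | grep "$@" >/dev/null 2>&1; then
        echo "match"
    else
        echo "no match"
    fi
}

bre_plain=$(check '^[ab]+$')           # basic mode: + is a literal plus sign
bre_interval=$(check '^[ab]\{1,\}$')   # basic mode: POSIX interval, one or more
ere_plus=$(check -E '^[ab]+$')         # extended mode: + is a quantifier

printf 'basic,    unescaped +  : %s\n' "$bre_plain"
printf 'basic,    \\{1,\\}       : %s\n' "$bre_interval"
printf 'extended, unescaped +  : %s\n' "$ere_plus"
```

Only the first pattern should fail to match, since basic mode looks for a literal `+` character that the input does not contain.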
From man grep:
Matcher Selection
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression. This is highly experimental and grep -P may warn of unimplemented features.
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
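A minimal sketch of the portable form the man page recommends: the bracket expression `[{]` contains only a brace, so it matches a literal `{` under `-E` no matter how the implementation treats a bare `{`. The sample string is made up for illustration, and the interval example assumes a grep that follows POSIX extended syntax.

```shell
# Sample input containing a literal brace (made up for illustration).
sample='x{1'

# [{] is a bracket expression holding just "{", so it always means a literal brace.
literal_hit=$(printf '%s\n' "$sample" | grep -E '[{]1')

# A well-formed interval still acts as a quantifier in POSIX extended syntax.
interval_hit=$(printf '%s\n' "aaab" | grep -E '^[ab]{4}$')

echo "$literal_hit"
echo "$interval_hit"
```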
Alternatively, you can use egrep instead of grep -E.
In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
So use the backslashed version:
$ echo aaab | grep '^[ab]\+$'
aaab
Or activate the extended syntax:
$ echo aaab | egrep '^[ab]+$'
aaab
Escape the + with a backslash, or use extended grep (egrep), which is an alias for grep -E:
echo "aaab" | egrep '^[ab]+$'
aaab
echo "aaab" | grep '^[ab]\+$'
aaab