How to use a plus sign after a character class in a regular expression

In Cygwin, this does not return a match:

$ echo "aaab" | grep '^[ab]+$'

But this does return a match:

$ echo "aaab" | grep '^[ab][ab]*$'
aaab

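Aren't the two expressions equivalent? Is there any way to express "one or more characters from the character class" without typing the character class twice (as in the second example)?

According to this link, the two expressions should be the same, but perhaps Regular-Expressions.info does not cover bash in Cygwin.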

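grep has several matching "modes", and by default it uses only the basic set, which does not treat a number of metacharacters as special unless they are escaped. You can put grep into extended mode or Perl mode so that + is interpreted as a quantifier.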
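To see that basic mode treats a bare + as an ordinary character, here is a small sketch. It assumes GNU grep (as shipped with Cygwin), and the input string a+ is made up purely for illustration:

```shell
# In basic mode (the default), an unescaped + is an ordinary character,
# so this pattern means: one 'a' or 'b', followed by a literal '+'.
echo "a+" | grep '^[ab]+$'      # prints: a+

# "aaab" contains no literal '+', so grep prints nothing and exits with status 1.
echo "aaab" | grep '^[ab]+$' || echo "no match"   # prints: no match
```

This is why the original command produced no output: the pattern was valid, it just asked for a different string.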

From man grep:

Matcher Selection
  -E, --extended-regexp
     Interpret PATTERN as an extended regular expression (ERE, see below).  (-E is specified by POSIX.)
  -P, --perl-regexp
     Interpret PATTERN as a Perl regular expression.  This is highly experimental and grep -P may warn of unimplemented features.

Basic vs Extended Regular Expressions
  In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
  Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
  GNU  grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification.  For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax
       error in the regular expression.  POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.

Alternatively, you can use egrep instead of grep -E.
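The options above can be put side by side as a short sketch. It assumes GNU grep; the third form, the interval \{1,\}, is not mentioned in the excerpt but is the POSIX basic-mode way to write "one or more", whereas \+ in basic mode is a GNU extension:

```shell
# Basic mode with a backslashed \+ (GNU extension).
echo "aaab" | grep '^[ab]\+$'        # prints: aaab

# Extended mode: + is a quantifier without a backslash.
echo "aaab" | grep -E '^[ab]+$'      # prints: aaab

# Basic mode with a POSIX interval: \{1,\} means "one or more".
echo "aaab" | grep '^[ab]\{1,\}$'    # prints: aaab
```

All three write the character class only once, which is what the question asked for.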

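In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).

So use the backslashed version:

$ echo aaab | grep '^[ab]\+$'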
aaab

Or enable the extended syntax:

$ echo aaab | egrep '^[ab]+$'
aaab

Either escape the + with a backslash, or use extended grep (egrep, an alias for grep -E):

echo "aaab" | egrep '^[ab]+$'

aaab

echo "aaab" | grep '^[ab]\+$'

aaab
