General approach to (the equivalent of) "backreferences within character class"?

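In Perl regexes, expressions like \1, \2, etc. are usually interpreted as "backreferences" to previously captured groups, but not when the \1, \2, etc. appear within a character class. In the latter case, the \ is treated as an escape character, so \1 no longer refers to the captured group (inside a bracketed class, Perl reads it as an octal escape for the character with code 1).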

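Therefore, if (for example) one wanted to match a string (of length greater than 1) whose first character equals its last character but does not appear anywhere else in the string, the following regex will not do: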

/\A       # match beginning of string;
 (.)      # match and capture first character (referred to subsequently by \1);
 [^\1]*   # (WRONG) match zero or more characters different from character in \1;
 \1       # match \1;
 \z       # match the end of the string;
/sx       # s: let . match newline; x: ignore whitespace, allow comments

It does not work, because it matches (for example) the string 'a1a2a':

  DB<1> ( 'a1a2a' =~ /\A(.)[^\1]*\1\z/ and print "fail!" ) or print "success!"
fail!

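I can usually manage to find some workaround¹, but it is always rather problem-specific, and usually far more complicated-looking than what I would write if I could use backreferences within a character class.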

Is there a general (and hopefully straightforward) workaround?

¹ For example, for the problem in the example above, I'd use something like

/\A
 (.)              # match and capture first character (referred to subsequently
                  # by \1);
 (?!.*\1.+\z)     # a negative lookahead assertion for "a suffix containing \1";
 .*               # substring not containing \1 (as guaranteed by the preceding
                  # negative lookahead assertion);
 \1\z             # match last character only if it is equal to the first one
/sx

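...where I've replaced the reasonably simple (though incorrect) subexpression [^\1]* of the earlier regex with the somewhat more forbidding negative lookahead assertion (?!.*\1.+\z). This assertion basically says "give up if \1 appears anywhere beyond this point (other than at the last position)." Incidentally, I give this solution only to illustrate the kind of workaround I referred to in the question. I don't claim that it is a particularly good one.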

This can be accomplished with a negative lookahead inside a repeated group:

/\A         # match beginning of string;
 (.)        # match and capture first character (referred to subsequently by \1);
 ((?!\1).)* # match zero or more characters different from character in \1;
 \1         # match \1;
 \z         # match the end of the string;
/sx

This pattern can be used even if the group contains more than one character.
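
As a quick check of the pattern above, here is a minimal sketch that runs it against a few strings. The variable names ($first_eq_last, $pair) and the test strings are made up for illustration; the inner group is written as non-capturing, (?: ... ), which is an optional variation on the pattern shown and does not change what it matches.

```perl
use strict;
use warnings;

# Negative lookahead inside a repeated group: each iteration consumes one
# character, but only at a position where a copy of \1 does not start.
my $first_eq_last = qr/
    \A
    (.)              # capture the first character
    (?: (?!\1) . )*  # characters that do not start a copy of \1
    \1               # the first character again
    \z
/sx;

# Hypothetical test strings, chosen for illustration.
for my $s ('a1a2a', 'a12a', 'aa', 'a') {
    printf "%-6s %s\n", $s, ($s =~ $first_eq_last ? 'match' : 'no match');
}

# Same idea with a two-character group: the string must start and end with
# the same pair, and that pair must not start anywhere in between.
my $pair = qr/\A(..)(?:(?!\1).)*\1\z/s;
for my $s ('abxyab', 'ababab') {
    printf "%-6s %s\n", $s, ($s =~ $pair ? 'match' : 'no match');
}
```

With these inputs, 'a12a', 'aa' and 'abxyab' match, while 'a1a2a', 'a' and 'ababab' do not. The lookahead is tested once per consumed character, so for long inputs it may be slower than a plain negated character class would be.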
