Why is it matching two Properties - Regex

I have a regex that should capture only the Properties element containing %%text%%, but it captures more than that.

My regex: (<Properties>).+?%%.+?%%.+?(</Properties>)

It matches:

"<Properties>
<Property>TEXT</Property>
</Properties>
<Properties>
<Property >%%TEXT%%</Property>
</Properties>"

But I only want it to match:

"<Properties>
<Property >%%TEXT%%</Property>
</Properties>"

What am I doing wrong?

Use a tempered greedy token instead of the dot (.):

<Properties>(?:(?!</Properties>)[^])*%%(.+?)%%(?:(?!</Properties>)[^])*</Properties>

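The part (?:(?!</Properties>)[^]) consumes one character at a time, but only while that character does not begin </Properties>. This guarantees that no </Properties> appears between the opening tag and the text we want.

[^] matches any character, including newlines. This is JavaScript syntax; in most other flavors, [\s\S] does the same job.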
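As a minimal sketch of the tempered greedy token in action, the snippet below runs the pattern against the sample from the question. It assumes a JavaScript engine (the `[^]` token is JavaScript-specific), and the names `doc`, `tempered`, and `temperedMatch` are made up for illustration.

```javascript
// Sample input from the question: two <Properties> blocks, only the second contains %%TEXT%%.
const doc = [
  "<Properties>",
  "<Property>TEXT</Property>",
  "</Properties>",
  "<Properties>",
  "<Property >%%TEXT%%</Property>",
  "</Properties>",
].join("\n");

// Tempered greedy token: (?:(?!<\/Properties>)[^])* consumes any character
// as long as it does not start a closing </Properties> tag.
// The "/" is escaped only because this is a JavaScript regex literal.
const tempered =
  /<Properties>(?:(?!<\/Properties>)[^])*%%(.+?)%%(?:(?!<\/Properties>)[^])*<\/Properties>/;

const temperedMatch = doc.match(tempered);

console.log(temperedMatch[0]); // only the second <Properties> block
console.log(temperedMatch[1]); // → "TEXT"
```

The match cannot start at the first `<Properties>` because the token refuses to step over that block's `</Properties>`, so the engine moves on to the second block.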


Let's break the regex down against the actual match to see why it matches:

(<Properties>).+?%%.+?%%.+?(</Properties>)
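The key point is that a lazy `.+?` only prefers the shortest expansion; it still grows as far as it must for the rest of the pattern to succeed. Starting at the first `<Properties>`, the first `.+?` has to run past the first `</Properties>` to reach the only `%%` in the input. The sketch below reproduces this in JavaScript. It assumes the dot is allowed to match newlines (the `s` flag here), since the question does not state the flavor or flags; `sample`, `lazyPattern`, and `lazyMatch` are illustrative names.

```javascript
// Sample input from the question.
const sample = [
  "<Properties>",
  "<Property>TEXT</Property>",
  "</Properties>",
  "<Properties>",
  "<Property >%%TEXT%%</Property>",
  "</Properties>",
].join("\n");

// The original pattern, with the "s" flag so that "." also matches newlines (an assumption).
const lazyPattern = /(<Properties>).+?%%.+?%%.+?(<\/Properties>)/s;

const lazyMatch = sample.match(lazyPattern);

// The match begins at the FIRST <Properties> and spans both blocks.
console.log(lazyMatch.index);         // → 0
console.log(lazyMatch[0] === sample); // → true
```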

Instead, you want to make the regex more explicit:

(?:[^<%]|%(?!%)|<(?!/Properties>))

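The above matches any single character that is neither < nor %. If the character is one of those two, it matches % only when it is not followed by another %, and matches < only when it is not followed by /Properties>. Use it as a replacement for your dot (.). The result is: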

(<Properties>)(?:[^<%]|%(?!%)|<(?!/Properties>))+%%(?:[^<%]|%(?!%)|<(?!/Properties>))+%%(?:[^<%]|%(?!%)|<(?!/Properties>))+(</Properties>)

Because the regex is now more explicit, I can safely drop the lazy ? quantifier modifier.
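To check this against the sample from the question, here is a small JavaScript sketch. It builds the pattern from one repeated piece so the three copies stay in sync; the names `text`, `atom`, `explicit`, and `explicitMatch` are hypothetical, and the choice of JavaScript is an assumption, since this pattern uses no flavor-specific syntax.

```javascript
// Sample input from the question.
const text = [
  "<Properties>",
  "<Property>TEXT</Property>",
  "</Properties>",
  "<Properties>",
  "<Property >%%TEXT%%</Property>",
  "</Properties>",
].join("\n");

// One "safe" character: anything except < or %, or a % not followed by %,
// or a < that does not begin </Properties>.
const atom = "(?:[^<%]|%(?!%)|<(?!/Properties>))";

// Same pattern as above, assembled from the repeated piece.
// No "s" flag is needed because [^<%] already matches newlines.
const explicit = new RegExp(
  `(<Properties>)${atom}+%%${atom}+%%${atom}+(</Properties>)`
);

const explicitMatch = text.match(explicit);

console.log(explicitMatch[0]); // only the second <Properties> block
console.log(explicitMatch.index === text.lastIndexOf("<Properties>")); // → true
```

The greedy `+` is safe here because the repeated piece can never consume `%%` or `</Properties>`, so it stops at those boundaries on its own.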
