Does Bash support non-greedy regular expressions?

Why isn't my regex pattern lazy? It should capture the first number, not the second.

Here is a working bash script.

#!/bin/bash
text='here is some example text I want to match word1 and this number 3.01&nbsp;GiB here is some extra text and another number 1.89&nbsp;GiB'
regex='(word1|word2).*?number[[:blank:]]([0-9.]+)&nbsp;GiB'
if [[ "$text" =~ $regex ]]; then
    echo 'FULL MATCH:  '"${BASH_REMATCH[0]}"
    echo 'NUMBER CAPTURE:  '"${BASH_REMATCH[2]}"
fi

Here is the output:

FULL MATCH:  word1 and this number 3.01&nbsp;GiB here is some extra text and another number 1.89&nbsp;GiB
NUMBER CAPTURE:  1.89

In this online POSIX regex tester the pattern is lazy, as I expect. In Bash, however, it is greedy. The number capture should be 3.01, not 1.89.

Regarding .*?, the POSIX standard says:

The behavior of multiple adjacent duplication symbols ('+', '*', '?', and intervals) produces undefined results.

Regarding greedy matching, it says:

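If the pattern permits a variable number of matching characters and thus there is more than one such sequence starting at that point, the longest such sequence is matched.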
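To see the longest-match rule in isolation, here is a minimal sketch. The sample string and the variable names (sample, regex, whole, group) are made up for illustration and do not come from the question.

```bash
#!/bin/bash
# Illustrative input: two "key=N;" fields followed by trailing text.
sample='key=1;key=2;end'

# A well-defined ERE: a single '*' with no adjacent repetition symbol.
regex='key=(.*);'

if [[ "$sample" =~ $regex ]]; then
    whole=${BASH_REMATCH[0]}   # the entire match
    group=${BASH_REMATCH[1]}   # what (.*) consumed
    echo "whole: $whole"
    echo "group: $group"
fi
```

The match runs to the last ';' rather than the first, so whole is 'key=1;key=2;' and group is '1;key=2'. POSIX EREs have no operator that asks for the shortest sequence instead.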

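In this particular case, you can use [^&]* instead. Because the negated class cannot consume an '&', the match cannot extend past the first &nbsp;.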

text='here is some example text I want to match word1 and this number 3.01&nbsp;GiB here is some extra text and another number 1.89&nbsp;GiB'
regex='(word1|word2)[^&]*number[[:blank:]]([0-9.]+)&nbsp;GiB'
if [[ "$text" =~ $regex ]]; then
    echo 'FULL MATCH:  '"${BASH_REMATCH[0]}"
    echo 'NUMBER CAPTURE:  '"${BASH_REMATCH[2]}"
fi

Output:

FULL MATCH:  word1 and this number 3.01&nbsp;GiB
NUMBER CAPTURE:  3.01
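When no single character is available to exclude, one possible workaround is to cut the string at the first terminator with parameter expansion and then match greedily against what is left. The sketch below assumes that the first '&nbsp;GiB' in the text is the one that follows the wanted number; the names terminator, head, and number are hypothetical and chosen only for this example.

```bash
#!/bin/bash
text='here is some example text I want to match word1 and this number 3.01&nbsp;GiB here is some extra text and another number 1.89&nbsp;GiB'

# Assumed terminator: the literal string that ends the wanted match.
terminator='&nbsp;GiB'

# %% removes the longest suffix matching the pattern, so everything
# from the FIRST terminator onward is dropped.
head=${text%%"$terminator"*}

# A greedy match is safe now: the string ends right after the first number.
regex='(word1|word2).*number[[:blank:]]([0-9.]+)$'

if [[ "$head" =~ $regex ]]; then
    number=${BASH_REMATCH[2]}
    echo "NUMBER CAPTURE:  $number"
fi
```

This is not a general replacement for a lazy quantifier. If the first terminator is not preceded by a matching "number N", the match simply fails instead of moving on to a later occurrence.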
