Capturing numbers inside square brackets



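I want to capture all the numbers inside square brackets. The numbers are separated by commas. For example, from the text

some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.

I want to capture 7, 8 and 5. I tried the following pattern: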

pat = (?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])

But for "[7,8]" the candidate matches seem to overlap, and I only get "8".
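A short sketch that reproduces the behaviour described in the question, assuming the pattern is meant to read `(?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])` with its backslash escapes intact. The first greedy `[\d,\s]*` consumes everything up to the closing bracket and then backtracks by just one character so that `(\d)` can match, which leaves only the last digit in the capture group. The variable names `text` and `found` are illustrative only.

```python
import re

text = "some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts."

# The pattern from the question, with backslash escapes restored (assumption).
pat = r'(?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])'

# findall returns the single capture group once per match, so each bracket
# pair contributes only the one digit that (\d) ended up holding.
found = re.findall(pat, text)
print(found)  # ['8', '5']
```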

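Using lookbehind and lookahead here is, IMHO, overusing regular expressions. It is better to match the whole bracketed pattern and then strip the first and last characters (the brackets). Code like this is easier to follow and understand: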

import re
sample = r"""
some text [7,8], some other [2, 3] texts with 1 or 2 numbers [5]. [4,
5] other texts
"""
# Match each whole bracketed list, then strip the surrounding brackets.
result = [s[1:-1] for s in re.findall(r'\[\d+\s*(?:,\s*\d+)*\]', sample)]
print(result)

If you really want the regular expression itself to capture the result, you can use a capture group instead:

result = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', sample)
print(result)
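Both variants return one string per bracket pair (for example `'7, 8'`), not the individual numbers the question asks for. A minimal follow-up sketch, assuming the goal is a list of integers per bracket pair; the names `groups` and `numbers` are illustrative and not part of the original answer.

```python
import re

sample = "some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts."

# Capture the comma-separated contents of each bracket pair.
groups = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', sample)

# Split each captured string on commas; int() ignores surrounding whitespace.
numbers = [[int(n) for n in g.split(',')] for g in groups]
print(numbers)  # [[7, 8], [5]]
```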

With the PyPI regex module:

import regex
pat = r'\[(?P<numbers>\d+)(?:,\s*(?P<numbers>\d+))*\]'
s = r'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'
results = [match.captures('numbers') for match in regex.finditer(pat, s)]
print(results)

See the Python proof.

Results: [['7', '8'], ['5']].

Explanation of the expression

--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?P<numbers>             group and capture to "numbers":
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \k<numbers>
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?P<numbers>             group and capture to "numbers":
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \k<numbers>
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  \]                       ']'
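The breakdown above can also be written as a commented pattern using the standard library's `re.VERBOSE` flag. This is only a sketch under one assumption: the built-in `re` module does not accept the same group name twice, so here a single `numbers` group captures the whole list and a second `findall` pulls out the digit runs. The name `PAT` is illustrative.

```python
import re

# Stdlib variant (assumption: one "numbers" group around the whole list,
# because `re` rejects duplicate group names).
PAT = re.compile(r"""
    \[                  # literal '['
    (?P<numbers>        # group and capture to "numbers":
        \d+             #   digits (0-9), 1 or more times, greedy
        (?:             #   group, but do not capture (0 or more times):
            , \s*       #     ',' followed by optional whitespace
            \d+         #     digits (0-9), 1 or more times, greedy
        )*              #   end of grouping
    )                   # end of "numbers"
    \]                  # literal ']'
""", re.VERBOSE)

s = 'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'

# Extract the individual digit runs from each captured list.
results = [re.findall(r'\d+', m.group('numbers')) for m in PAT.finditer(s)]
print(results)  # [['7', '8'], ['5']]
```

This gives the same nested-list shape as the `captures('numbers')` call, without the third-party dependency.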
