Python regex to capture valid comma-separated numbers

I am working with a multi-line string and trying to capture the valid comma-separated numbers in it.

For example:

my_string = """42     <---capture 42 in this line
1,234    <---capture 1,234 in this line
3,456,780    <---capture 3,456,780 in this line
34,56,780    <---don't capture anything in this line but 34 and 56,780 captured
1234    <---don't capture anything in this line but 123 and 4 captured
"""

Ideally, I would like re.findall to return:

['42', '1,234', '3,456,780']

Here is my code:

import re
a = """
42
1,234
3,456,780
34,56,780
1234
"""
regex = re.compile(r'\d{1,3}(?:,\d{3})*')
print(regex.findall(a))

The result of the code above is:

['42', '1,234', '3,456,780', '34', '56,780', '123', '4']

But the output I want is:

['42', '1,234', '3,456,780']

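If you only want to capture whole lines that match the pattern, anchor the regex with ^ and $, and use the re.MULTILINE flag so that they match at the start/end of each line rather than only at the start/end of the string.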

regex = re.compile(r'^\d{1,3}(?:,\d{3})*$', re.MULTILINE)
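
A minimal runnable sketch of the anchored approach, reusing the sample string `a` from the question. The names `anchored` and `result` are only illustrative. It assumes each number sits alone on its line; if a line carries trailing text (such as the `<---` annotations in `my_string`), `$` will not match there and that line yields nothing.

```python
import re

a = """
42
1,234
3,456,780
34,56,780
1234
"""

# With re.MULTILINE, ^ and $ match at the start and end of every line,
# so the whole line must consist of one well-formed number.
anchored = re.compile(r'^\d{1,3}(?:,\d{3})*$', re.MULTILINE)
result = anchored.findall(a)
print(result)  # ['42', '1,234', '3,456,780']
```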

Use lookarounds to make sure there is no digit or comma immediately before or after the number:

import re
a = """
42
1,234
3,456,780
34,56,780
1234
"""
regex = re.compile(r'(?<![\d,])\d{1,3}(?:,\d{3})*(?![\d,])')
print(regex.findall(a))

Output:

['42', '1,234', '3,456,780']
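
Unlike the anchored version, the lookaround pattern does not require the number to fill the whole line. A small sketch on a made-up sentence (the `text` value and the names `pattern` and `found` are hypothetical, not from the question) shows it picking numbers out of running text:

```python
import re

# Lookaround pattern: the match may not be preceded or followed
# by a digit or a comma.
pattern = re.compile(r'(?<![\d,])\d{1,3}(?:,\d{3})*(?![\d,])')

# Hypothetical sample sentence mixing valid and invalid numbers.
text = "paid 1,234 and 42 but not 34,56,780 or 1234"
found = pattern.findall(text)
print(found)  # ['1,234', '42']
```

One trade-off to be aware of: because a trailing comma is rejected, a valid number directly followed by a comma (for example `1,234, 42`) is skipped as well.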
