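I'm trying to do a multiline match that spans 4 lines. My code finds the first match, but not the others.
Here is the pattern:
pattern = re.compile(r"([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")
Here is the subject: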
sub = """yahoo.com.
Public
8
Z2RVE9XGX4PFJN
google.com.
Public
7
Z2VATLWTLBDR5D
"""
Here is the full code:
import re
pattern = re.compile(r"([a-z]+\.com\.|net\.)[.\s\S]+(Z[A-Z0-9]+)")
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""
m = pattern.findall(sub)
print(m)
Here is the result:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN')]
Finally, this is the desired result:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]
Thanks.
You're close. Just make your match less greedy:
import re
pattern = re.compile(r"([a-z]+\.com\.|net\.)[\s\S]+?(Z[A-Z0-9]+)")
# Note the 'less greedy' addition                  ^
# The '.' is not necessary                    ^ in the character class
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
google.com.
Public
7
Z2VATZOPLBDR5D
"""
m = pattern.findall(sub)
print(m)
Prints:
[('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('google.com.', 'Z2VATZOPLBDR5D')]
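To see why the `?` matters, here is a minimal sketch that runs the greedy and the lazy form side by side. The input is shortened and the IDs `Z111` and `Z222` are made up for illustration (each contains a single `Z` so the greedy result is easy to follow); only the `.com.` branch of the pattern is kept.

```python
import re

# Hypothetical, shortened sample data (not the IDs from the question).
sub = "yahoo.com.\nPublic\n8\nZ111\ngoogle.com.\nPublic\n7\nZ222\n"

# Greedy: [\s\S]+ grabs as much as possible, then backtracks to the LAST 'Z...'.
greedy = re.compile(r"([a-z]+\.com\.)[\s\S]+(Z[A-Z0-9]+)")
# Lazy: [\s\S]+? grabs as little as possible, stopping at the FIRST 'Z...'.
lazy = re.compile(r"([a-z]+\.com\.)[\s\S]+?(Z[A-Z0-9]+)")

greedy_matches = greedy.findall(sub)
lazy_matches = lazy.findall(sub)

print(greedy_matches)  # [('yahoo.com.', 'Z222')]
print(lazy_matches)    # [('yahoo.com.', 'Z111'), ('google.com.', 'Z222')]
```

The greedy version swallows the second record inside a single match, which is why only one tuple comes back.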
To be more specific, you may want to use anchors:
pattern = re.compile(r"^([a-z]+\.com\.|net\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)
# Start of line        ^                              ^
# End of line                                ^                     ^
# Multi line flag                                                      ^
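One thing to watch in the first group: the alternation `[a-z]+\.com\.|net\.` reads as "either `[a-z]+\.com\.` or the bare text `net.`", so a `.net.` domain would not be captured together with its name. If the intent is to accept both endings, a possible sketch groups the alternatives with `(?:com|net)`. This is an assumption about the intent, and the `example.net.` line is made-up data:

```python
import re

# Assumed intent: a lowercase name followed by '.com.' or '.net.' on its own line,
# then (lazily) anything up to a line consisting only of an ID starting with 'Z'.
pattern = re.compile(r"^([a-z]+\.(?:com|net)\.)$[\s\S]+?^(Z[A-Z0-9]+)$", re.M)

# 'example.net.' is a hypothetical record added for illustration.
sub = """yahoo.com.
Public
8
Z2RVE9JJGX4PFJN
example.net.
Public
7
Z2VATZOPLBDR5D
"""

pairs = pattern.findall(sub)
print(pairs)
# [('yahoo.com.', 'Z2RVE9JJGX4PFJN'), ('example.net.', 'Z2VATZOPLBDR5D')]
```

Because the ID is anchored with `^` and `$`, it has to fill a whole line, so a `Z` in the middle of an ID cannot start a match.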