Error: bad escape in a Python script that uses regular expressions



I am trying to search one list against another, printing one message when an item matches and a different one when it does not.

import re
array = ['brasil','argentina','chile','canada']
array2 = ['brasil.sao_paulo','chile','argentina']
for x,y in zip(array,array2):
    if re.search('\{}\b'.format(x), y, re.IGNORECASE):
        print("Match: {}".format(x))
    else:
        print("Not match: {}".format(y))

Output:

Not match: brasil.sao_paulo
Not match: chile
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    if re.search('\{}\b'.format(x), y, re.IGNORECASE):
  File "/usr/local/lib/python3.7/re.py", line 183, in search
re.error: bad escape \c at position 0

Expected output:

Match: brasil
Match: argentina
Match: chile
Not match: canada

If I understand correctly, you don't need a regular expression here.

group_1 = ['brasil','argentina','chile','canada']
group_2 = ['brasil.sao_paulo','chile','argentina']
for x in group_1:
    # For group 2 only, this picks out the part of the string that appears before the first ".".
    if x in [y.split('.')[0] for y in group_2]:
        print("Match: {}".format(x))
    else:
        print("Not match: {}".format(x))

This returns:

Match: brasil
Match: argentina
Match: chile
Not match: canada
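
As a variation on the approach above, the prefixes can be computed once instead of rebuilding the list on every loop iteration. This is only a sketch: the helper name `match_by_prefix` is made up for illustration, and it assumes, as the answer does, that only the text before the first "." matters.

```python
def match_by_prefix(names, dotted_names):
    """Pair each name with True/False depending on whether it equals the
    part of any dotted name that precedes the first "." (hypothetical helper)."""
    # Build the set of prefixes once; membership tests on a set are cheap.
    prefixes = {item.split('.')[0] for item in dotted_names}
    return [(name, name in prefixes) for name in names]

group_1 = ['brasil', 'argentina', 'chile', 'canada']
group_2 = ['brasil.sao_paulo', 'chile', 'argentina']
for name, found in match_by_prefix(group_1, group_2):
    print("{}: {}".format("Match" if found else "Not match", name))
```

Like the original loop, this comparison is case-sensitive and exact on the prefix.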

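If you zip, you only get pairwise matches. Given the nature of the search, you can join the haystack into a single space-separated string, join the needles into an alternation pattern, and let findall chug through it: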

>>> import re
>>> needles = ['brasil', 'argentina', 'chile', 'canada']
>>> haystack = ['brasil.sao_paulo', 'chile', 'argentina']
>>> re.findall(r"\b(?:%s)\b" % "|".join(needles), " ".join(haystack), re.I)
['brasil', 'chile', 'argentina']

The intent behind the \ in the original regex is unclear, so I am assuming you want \b on both sides of the pattern.
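
For context on why the original pattern failed: `'\{}\b'` is not a raw string, so `'\b'` becomes a backspace character, and the leading backslash fuses with the first letter of each name. For `chile` that produces `\c`, which `re` rejects as a bad escape. A possible per-needle version, sketched under the assumption that whole-word, case-insensitive matching is what was wanted, uses a raw string together with `re.escape`. The helper name `find_matches` is hypothetical.

```python
import re

def find_matches(needles, haystack):
    """Return (needle, matched) pairs; a needle matches when it occurs as a
    whole word in at least one haystack item (hypothetical helper)."""
    results = []
    for needle in needles:
        # The raw string keeps \b as a regex word boundary, and re.escape
        # protects against needles that contain regex metacharacters.
        pattern = re.compile(r"\b{}\b".format(re.escape(needle)), re.IGNORECASE)
        matched = any(pattern.search(item) for item in haystack)
        results.append((needle, matched))
    return results

needles = ['brasil', 'argentina', 'chile', 'canada']
haystack = ['brasil.sao_paulo', 'chile', 'argentina']
for needle, matched in find_matches(needles, haystack):
    print("{}: {}".format("Match" if matched else "Not match", needle))
```

Unlike a single `findall` call, this reports the needles that did not match as well, which is what the expected output in the question asks for.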

A simple solution using any:

array = ['brasil', 'argentina', 'chile', 'canada']
array2 = ['brasil.sao_paulo', 'chile', 'argentina']
for x in array:
    if any(x.casefold() in y.casefold() for y in array2):
        print("Match:", x)
    else:
        print("Not match:", x)


Edit: use casefold() to make it case-insensitive.
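
To illustrate what `casefold()` buys over `lower()`, and one caveat of the `in` test, here is a small sketch. The helper name `contains_caseless` and the sample strings are made up for this example.

```python
def contains_caseless(needle, text):
    """Case-insensitive substring test using casefold() (hypothetical helper)."""
    return needle.casefold() in text.casefold()

# Ordinary case differences are ignored.
print(contains_caseless("Brasil", "BRASIL.sao_paulo"))

# casefold() also folds characters that lower() leaves alone, such as "ß".
print(contains_caseless("Straße", "STRASSE"))
print("Straße".lower() in "STRASSE".lower())

# Caveat: `in` is a substring test, so partial words match as well.
print(contains_caseless("chile", "chilean_wine"))
```

If partial-word hits such as the last one are unwanted, a word-boundary regex or a comparison on the split prefix would be the stricter choice.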
