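I am experimenting with a regular expression that picks up the order-preserving pattern of words found in "field" and uses it to find matches in "text". I want the regex to find partial matches too, as in example 1 below.
Making each word optional might be one way to do it, but then the regex starts matching arbitrary things.
I would like your help writing code that takes "field", generates a regex from it, and then looks for that pattern in "text". Partial matches are fine.
Both input strings can be anything, so the regex should be generic enough to handle any value.
Please ask clarifying questions if needed! It would be very helpful if you could point out what I am doing wrong.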
def regexp(field, text):
    import re
    # Split field on non-word characters to get the individual words
    key = re.split(r'\W', field)
    regex = "^.*"
    for x in key:
        if len(x) > 0:
            #regex += "("+x+")?"
            regex += x
            regex += ".*"
    regex = r'{}'.format(regex)
    pattern = re.compile(regex, re.IGNORECASE)
    matches = list(re.finditer(pattern, text))
    print(matches, "\n", pattern)
    if len(matches) > 0:
        return True
    else:
        return False
Examples:
print(regexp("F1 gearbox: 0-400 m", "f1 gearbox"))  # this should match
# this is a partial match, my regex should be able to find this match
print(regexp("0-100 kmph", "100-0 kmph"))  # this should not match
# the order of characters/words in my regex/text should match
print(regexp("F1 gearbox: 0-400 m", "none"))  # this should not match
# if I try to use "(word)?" in my regex then everything becomes optional
# and it starts to match random words like "none", "sbhsuckjcsak", etc.
# this obviously is not expected.
print(regexp("Combined* (ECE+EUDC) (l/100 km)", "combined ece eudc"))  # this should match
# because it is a partial match and special characters are not important
# for my matching use case
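The over-matching described in the comments above can be reproduced in isolation. This is a minimal sketch, assuming the optional variant is built the way the commented-out line in the function suggests; `tokens` and `optional_regex` are names introduced here only for illustration.

```python
import re

field = "F1 gearbox: 0-400 m"

# Keep only the non-empty words of the field.
tokens = [t for t in re.split(r'\W', field) if t]

# Wrap every word in an optional group: ^.*(F1)?.*(gearbox)?.*(0)?.* ...
optional_regex = "^.*" + "".join("({})?.*".format(t) for t in tokens)
pattern = re.compile(optional_regex, re.IGNORECASE)

# Every group may match nothing, so the pattern collapses to "^.*",
# which succeeds on any input at all.
for text in ["f1 gearbox", "none", "sbhsuckjcsak"]:
    print(text, bool(pattern.search(text)))
```

Because no word is required any more, the pattern carries no information about the field, which is why unrelated strings match.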
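Your function already produces the right results for the examples you posted; you only need to swap the order of "text" and "field". I also made the code shorter and (at least in my opinion) easier to read: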
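def regexp(text, field):
    import re
    # Split field on non-word characters to get the individual words
    key = re.split(r'\W', field)
    # The words must appear in text in the same order, with anything in between
    regex = rf'^.*{".*".join(key)}'
    pattern = re.compile(regex, re.IGNORECASE)
    matches = re.findall(pattern, text)
    # print(matches, "\n", pattern)
    return len(matches) > 0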
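One edge case may be worth guarding against: if "field" contains no word characters at all (for example "***"), the split yields only empty strings and the joined pattern matches any text. The following is a rough sketch of a slightly more defensive variant, not a definitive implementation. The name `field_in_text` is made up for this example, and returning False for a field with no words is an assumption about the desired behavior.

```python
import re

def field_in_text(text, field):
    """Return True if the words of `field` occur in `text`, in order."""
    # Split on runs of non-word characters and drop empty pieces.
    tokens = [t for t in re.split(r'\W+', field) if t]
    if not tokens:
        # Assumption: a field without any word characters matches nothing.
        return False
    # re.escape is redundant for \w-only tokens, but keeps the pattern
    # safe if the tokenizing rule is ever changed.
    regex = '.*'.join(re.escape(t) for t in tokens)
    pattern = re.compile(regex, re.IGNORECASE | re.DOTALL)
    # search() scans the whole text, so no leading "^.*" is needed.
    return pattern.search(text) is not None

print(field_in_text("F1 gearbox: 0-400 m", "f1 gearbox"))
print(field_in_text("0-100 kmph", "100-0 kmph"))
print(field_in_text("F1 gearbox: 0-400 m", "none"))
print(field_in_text("Combined* (ECE+EUDC) (l/100 km)", "combined ece eudc"))
```

As in the function above, words are matched as substrings, so a short word such as "m" can match inside a longer one. If whole-word matching is wanted, each token could be wrapped in `\b` word boundaries.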