I want to extract numbers from strings that do not start with order or cart and do not end with dollar.
I have the following code:
import re
pattern = r"(^.*\b(?:cart|order)\b.*)(\b\d{2,12}\b)"
text1 = "the cart number is 1234 and 4567 that it!"
text2 = "order number note down 12345"
text3 = "credit card is 0000 and 3456"
text4 = "the 4567"
text5 = "i got 245 dollar"
result = re.search(pattern, text4)
print(result)
Expected matches:
text3 0000 3456
text4 4567
Can we get the expected output with the re module, passing in one example at a time?
You can do this with two regexps. One checks whether the string starts with cart or order (optionally preceded by an article) or ends with dollar, and the other extracts the numbers:
import re
# Whole-word numbers of 2 to 12 digits.
pattern = re.compile(r"\b\d{2,12}\b")
# Strings to skip: cart/order at the start (after an optional article), or dollar at the end.
pattern_valid = re.compile(r"^\s*(?:(the|an?)\s+)?(cart|order)\b|\bdollar\s*$", re.I)
texts = ['the cart number is 1234 and 4567 that it!',
         "order number note down 12345",
         "credit card is 0000 and 3456",
         "the 4567",
         "i got 245 dollar"]
for text in texts:
    if not pattern_valid.search(text):
        print(text, pattern.findall(text), sep=" => ")
Output:
credit card is 0000 and 3456 => ['0000', '3456']
the 4567 => ['4567']
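^\s*(?:(the|an?)\s+)?(cart|order)\b|\bdollar\s*$
This regex matches an optional article (the, a or an) at the start of the string followed by the word order or cart, or it matches the word dollar at the end of the string. If it matches, the string is skipped. If it does not match, the second regex extracts every 2-12 digit number that stands as a whole word.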
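Since the question asks about passing one example at a time, the same check can be wrapped in a small helper. This is only a sketch built on the two patterns above. The names extract_numbers, NUMBER and EXCLUDED are made up for illustration, and returning an empty list for skipped strings is an assumption about what is wanted:

```python
import re

# Same two patterns as above; non-capturing groups are used because
# only the presence of a match matters here.
NUMBER = re.compile(r"\b\d{2,12}\b")
EXCLUDED = re.compile(r"^\s*(?:(?:the|an?)\s+)?(?:cart|order)\b|\bdollar\s*$", re.I)

def extract_numbers(text):
    """Return the 2-12 digit numbers in text, or [] if the string is excluded."""
    if EXCLUDED.search(text):
        return []
    return NUMBER.findall(text)

print(extract_numbers("credit card is 0000 and 3456"))  # ['0000', '3456']
print(extract_numbers("the 4567"))                      # ['4567']
print(extract_numbers("i got 245 dollar"))              # []
```

Note that an empty list is also what you get for a valid string with no numbers in it, so return None for excluded strings instead if you need to tell the two cases apart.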