Regular expression for html/xhtml file names

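I'm trying to come up with a regular expression for the following:

"Any number of characters, then 'ch01' or 'chapter01'; the next character must not be a digit; then any number of characters up to a period, which must be followed by 'html' or 'xhtml'."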

Sorry if this is confusing; some test cases may help more:

import re

x = 'fdsafafsdch01fdsfdsf.xhtml'  # pass
y = '9781599048970_ch01__s1_002.html'  # pass
z = 'ch01.html'  # pass
a = 'chapter019.xhtml'  # fail
l = 'chapter01.html'  # pass
m = 'chapter010-fn.xhtml'  # fail
matches = [x, y, z, a, l, m]
for item in matches:
    # Raw string so that \D and \. reach the regex engine unchanged
    print(bool(re.search(r'ch(apter)?01\D?.*\.x?html', item)))

(#fail should print False, #pass should print True.)

At the moment, every case returns True.

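The problem appears to be the \D?. It means "zero or one non-digit", so for chapter019 the regex matches "chapter01", followed by zero non-digits, followed by one character (the 9, which .* consumes), and it happily matches. Try making the ? apply to both the \D and the .* that follows it: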

for item in matches:
    # The optional group must start with a non-digit if it is present at all
    print(bool(re.search(r'ch(apter)?01(\D.*)?\.x?html', item)))

Result:

True
True
True
False
True
False
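
For anyone who wants to paste this into a file and run it, here is a minimal self-contained sketch of the same fix. The names CHAPTER_ONE and is_chapter_one are made up for illustration, and the trailing $ is an extra assumption that is not part of the pattern above: it rejects names that carry on after "html". The first two lines only show what the original pattern matched on one of the cases that should fail.

```python
import re

# What the original pattern matched on a case that should fail:
# \D? matches nothing, and .* then swallows the '9'.
loose = re.search(r'ch(apter)?01\D?.*\.x?html', 'chapter019.xhtml')
print(loose.group(0))

# Fixed pattern: the optional group, if present, must begin with a non-digit.
# The trailing $ is an added assumption (not in the pattern above).
CHAPTER_ONE = re.compile(r'ch(?:apter)?01(?:\D.*)?\.x?html$')


def is_chapter_one(filename):
    """Return True if filename looks like a chapter-01 html/xhtml file."""
    return CHAPTER_ONE.search(filename) is not None


# The six test cases from the question, with the expected outcome for each.
cases = {
    'fdsafafsdch01fdsfdsf.xhtml': True,
    '9781599048970_ch01__s1_002.html': True,
    'ch01.html': True,
    'chapter019.xhtml': False,
    'chapter01.html': True,
    'chapter010-fn.xhtml': False,
}

for name, expected in cases.items():
    print(name, is_chapter_one(name), 'expected:', expected)
```

If file names with trailing text after "html" should also be accepted, drop the $ and the behaviour goes back to that of the pattern above.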
