sre_constants.error: nothing to repeat in Jython



I have some HTML content and I want to extract the comments from it:

content = """<html>
<body>
<!--<h1>test</h1>-->
<!--<div>
<img src='x'>
</div>-->
Blockquote
<!--
<div>
<img src='xe'>
</div>
-->
</body>
</html>"""

I am using this regular expression:

regex_str = "<!--((\n|\r)+)?((.*?)+((\n|\r)+)?)+-->"

When I run this line in Python:

re.findall(regex_str,content)

it works.

But when I run it in Jython, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/share/java/jython-2.7.2.jar/Lib/re$py.class", line 177, in findall
  File "/usr/share/java/jython-2.7.2.jar/Lib/re$py.class", line 242, in _compile
sre_constants.error: nothing to repeat

Use

<!--[\n\r]*([\w\W]*?)[\n\r]*-->

See the regex proof.

Explanation

--------------------------------------------------------------------------------
  <!--                     '<!--'
--------------------------------------------------------------------------------
  [\n\r]*                  any character of: '\n' (newline), '\r'
                           (carriage return) (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [\w\W]*?                 any character of: word characters (a-z,
                             A-Z, 0-9, _), non-word characters (all
                             but a-z, A-Z, 0-9, _) (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  [\n\r]*                  any character of: '\n' (newline), '\r'
                           (carriage return) (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  -->                      '-->'
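
As a rough sketch of how the pattern above could be applied to the sample from the question, the snippet below compiles it as a raw string (so the backslashes reach the regex engine unchanged) and calls findall. The name COMMENT_RE is only illustrative and does not come from the original post; the snippet uses nothing beyond the standard re module.

```python
import re

# Sample HTML from the question.
content = """<html>
<body>
<!--<h1>test</h1>-->
<!--<div>
<img src='x'>
</div>-->
Blockquote
<!--
<div>
<img src='xe'>
</div>
-->
</body>
</html>"""

# COMMENT_RE is an illustrative name, not part of the original post.
# [\w\W] matches any character, newlines included, so no DOTALL flag is needed.
# The lazy *? stops at the first closing '-->', keeping each comment separate.
COMMENT_RE = re.compile(r"<!--[\n\r]*([\w\W]*?)[\n\r]*-->")

# findall returns the contents of capture group 1 for every match.
comments = COMMENT_RE.findall(content)

for comment in comments:
    print(repr(comment))
```

With this sample, findall should return three strings, one per comment, with the line breaks directly after `<!--` and directly before `-->` left out of the captured text.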
