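I am trying to select the content between two HTML comments, but I am having trouble getting it right (following "XPath to select between two HTML comments?"). The problem seems to appear when the next comment starts on the same line as the previous one.
My page: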
<html>
........
<!-- begin content -->
<div>some text</div>
<div>
<p>Some more elements</p>
</div>
<!-- end content --><!-- begin content -->
<div>more text</div>
<!-- end content -->
.......
</html>
I use:
doc.xpath("//node()[preceding-sibling::comment()[. = ' begin content ']]
[following-sibling::comment()[. = ' end content ']]")
Result:
<div>some text</div>
<div>
<p>Some more elements</p>
</div>
<!-- end content --><!-- begin content -->
<div>more text</div>
What I want to get:
<div>some text</div>
<div>
<p>Some more elements</p>
</div>
If you are only interested in the first pair of comments, you can start from the opening comment:
//comment()[.=' begin content ']/following::*[not(preceding::comment()[.=' end content '])]
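That is:
//comment()[.=' begin content ']                 <-- find a "begin content" comment
/following::*                                    <-- take every element after it in document order (nested elements included)
[not(preceding::comment()[.=' end content '])]   <-- keep only elements with no "end content" comment before them
Every element after the first "end content" comment has that comment on its preceding axis, so only the first block survives the filter. Note that following::* selects elements only, and it returns the nested <p> as a separate result in addition to the <div> that contains it.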
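
To check this against the sample page, here is a minimal sketch. It assumes Python with lxml, which is my assumption since the question does not say which library is behind doc.xpath. The explicit <body> wrapper, the element_tags helper and the sibling-axis variant are my own additions for illustration and are not part of the original answer.

```python
from lxml import etree

# Sample page from the question, wrapped in an explicit <body> (an assumption)
# so the HTML parser does not have to guess where the comments belong.
PAGE = """<html><body>
<!-- begin content -->
<div>some text</div>
<div>
<p>Some more elements</p>
</div>
<!-- end content --><!-- begin content -->
<div>more text</div>
<!-- end content -->
</body></html>"""

doc = etree.HTML(PAGE)

# Expression from the question: any node with SOME begin comment before it
# and SOME end comment after it, so both blocks run together.
QUESTION_XPATH = (
    "//node()[preceding-sibling::comment()[. = ' begin content ']]"
    "[following-sibling::comment()[. = ' end content ']]"
)

# Expression from the answer: elements after a begin comment that have
# no end comment anywhere before them in document order.
ANSWER_XPATH = (
    "//comment()[.=' begin content ']"
    "/following::*[not(preceding::comment()[.=' end content '])]"
)

# Hypothetical variant (not from the answer): stay on the sibling axis so
# nested elements such as <p> are not returned a second time.
SIBLING_XPATH = (
    "(//comment()[.=' begin content '])[1]"
    "/following-sibling::*[not(preceding-sibling::comment()[.=' end content '])]"
)


def element_tags(nodes):
    """Return the tag names of real elements, skipping comments and text."""
    return [n.tag for n in nodes if isinstance(getattr(n, "tag", None), str)]


# Comments that the question's expression drags into the result.
stray_comments = [
    n for n in doc.xpath(QUESTION_XPATH)
    if getattr(n, "tag", None) is etree.Comment
]

print(element_tags(doc.xpath(QUESTION_XPATH)))  # ['div', 'div', 'div']
print(len(stray_comments))                      # 2
print(element_tags(doc.xpath(ANSWER_XPATH)))    # ['div', 'div', 'p']
print(element_tags(doc.xpath(SIBLING_XPATH)))   # ['div', 'div']
```

The first two lines of output reproduce the problem: the third <div> and the two middle comments are selected, because each of them has some begin comment before it and some end comment after it. The answer's expression drops the second block but lists the nested <p> separately. If you serialize the results one by one, the sibling-axis variant is probably closer to the desired output, since the <p> then appears only inside its parent <div>.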