XML:
<body>
<h2><font style="font-weight: bold">Baz</font></h2>
<p><img title="image" /></p>
<p>Baz 0 with an <a href="http://">anchor</a> element.</p>
<p>Baz 1 with an <a href="http://">anchor</a> element.</p>
<hr />
<h2><font style="font-weight: bold">People</font></h2>
<ul>
<li>People 0 with <a href="http://" >an anchor</a> element.</li>
<li>People 1 with an <a href="http://" >an anchor</a> element.</li>
</ul>
<hr/>
<h2><font style="font-weight: bold">Sales</font></h2>
<ul>
<li>List item 2 with an <a href="http://" >an anchor</a> element.</li>
<li>List item 3 with an <a href="http://" >an anchor</a> element.</li>
<li>List item 4 without an anchor element.</li>
</ul>
<hr />
<h2><font style="font-weight: bold">Sales</font></h2>
<p><img title="image" /></p>
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
<hr />
<h2><font style="font-weight: bold">Foo</font></h2>
<ul>
<li>Foo 0 with <a href="http://" >an anchor</a> element.</li>
<li>Foo 1 with an <a href="http://" >an anchor</a> element.</li>
</ul>
<hr />
<h2><font style="font-weight: bold">Bar</font></h2>
<p><img title="image" /></p>
<p>bar 0 with an <a href="http://">anchor</a> element.</p>
<p>bar 1 with an <a href="http://">anchor</a> element.</p>
<hr />
</body>
This path: //p[a and preceding-sibling::h2[font[text()='Sales']][1] and following-sibling::hr[1]]
returns:
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
<p>bar 0 with an <a href="http://">anchor</a> element.</p>
<p>bar 1 with an <a href="http://">anchor</a> element.</p>
Expected p:
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
Expected li:
<li>List item 2 with an <a href="http://" >an anchor</a> element.</li>
<li>List item 3 with an <a href="http://" >an anchor</a> element.</li>
What am I missing?
How would I change the xpath to include li[a] the same way I'm including p[a]? preceding-sibling/following-sibling doesn't work for li.
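You just need to specify that it is the first preceding sibling h2:
preceding-sibling::h2[1]
In your path, h2[font[text()='Sales']][1] filters for Sales headings first and only then takes the nearest match, so every p that comes after any Sales heading passes, including the ones under Bar.
Updated xpath (I also simplified the test for Sales):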
//p[a and preceding-sibling::h2[1][.='Sales'] and following-sibling::hr]
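To make the predicate-order difference concrete, here is a rough sketch in Python. Python's standard xml.etree.ElementTree does not support the preceding-sibling axis, so this does not evaluate the XPath itself; it spells the two predicates out by hand over a trimmed copy of the sample XML. The helper names (heading_text, select) and the nearest_only flag are made up for illustration, and comparing the heading's full text stands in for the original font[text()='Sales'] test, which is equivalent for this sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML: the second "Sales" section and "Bar".
XML = """<body>
<h2><font style="font-weight: bold">Sales</font></h2>
<p><img title="image" /></p>
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
<hr />
<h2><font style="font-weight: bold">Bar</font></h2>
<p><img title="image" /></p>
<p>bar 0 with an <a href="http://">anchor</a> element.</p>
<p>bar 1 with an <a href="http://">anchor</a> element.</p>
<hr />
</body>"""


def heading_text(h2):
    # String value of the element, like "." in XPath.
    return "".join(h2.itertext())


def select(body, nearest_only):
    children = list(body)
    hits = []
    for i, el in enumerate(children):
        # p[a ...]: only paragraphs with an <a> child
        if el.tag != "p" or el.find("a") is None:
            continue
        # preceding-sibling::h2, nearest first (reverse document order)
        h2s = [c for c in reversed(children[:i]) if c.tag == "h2"]
        if nearest_only:
            # h2[1][.='Sales']: take the nearest h2, then test its text
            ok = bool(h2s) and heading_text(h2s[0]) == "Sales"
        else:
            # h2[...'Sales'][1]: filter first, then take the nearest match,
            # which is non-empty whenever any earlier heading is "Sales"
            ok = any(heading_text(h) == "Sales" for h in h2s)
        # following-sibling::hr: some hr comes later among the siblings
        has_hr_after = any(c.tag == "hr" for c in children[i + 1:])
        if ok and has_hr_after:
            hits.append(" ".join(el.text.split()[:2]))
    return hits


body = ET.fromstring(XML)
print(select(body, nearest_only=False))  # ['sales 0', 'sales 1', 'bar 0', 'bar 1']
print(select(body, nearest_only=True))   # ['sales 0', 'sales 1']
```

The first call reproduces the unwanted bar paragraphs from the question; the second matches the expected p output.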
Also, if you need to make sure that the first following sibling that is not a p is an hr, you can try this...
//p[a and preceding-sibling::h2[1][.='Sales'] and following-sibling::*[not(self::p)][1][self::hr]]
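If you are trying to select li in addition to p, you can update the xpath to use preceding:: and following::, but you have to account for any elements that might appear as children of p (or li), such as a, span, etc...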
//*[self::p or self::li][a and preceding::h2[1][.='Sales'] and following::*[not(self::p) and not(self::li) and not(self::a)][1][self::hr]]
This would select the following from your sample XML...
<li>List item 2 with an <a href="http://" >an anchor</a> element.</li>
<li>List item 3 with an <a href="http://" >an anchor</a> element.</li>
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
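As a rough illustration of why a has to be excluded in the combined path, the sketch below walks a trimmed copy of the sample XML in document order with Python's standard xml.etree.ElementTree. It is a hand-written emulation of preceding::h2[1] and following::*[...][1], not an XPath evaluator, and the names first_following, select_p_and_li and skip are invented for this example. It also assumes an h2 is never an ancestor of a p or li, which holds for this document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML.
XML = """<body>
<h2><font style="font-weight: bold">Sales</font></h2>
<ul>
<li>List item 2 with an <a href="http://">an anchor</a> element.</li>
<li>List item 4 without an anchor element.</li>
</ul>
<hr />
<h2><font style="font-weight: bold">Sales</font></h2>
<p>sales 0 with an <a href="http://">anchor</a> element.</p>
<p>sales 1 with an <a href="http://">anchor</a> element.</p>
<hr />
<h2><font style="font-weight: bold">Bar</font></h2>
<p>bar 0 with an <a href="http://">anchor</a> element.</p>
<hr />
</body>"""


def first_following(order, idx, skip):
    # following::*[not(self::...)][1]: first element after order[idx] in
    # document order that is not one of its descendants and not in `skip`.
    inside = set(order[idx].iter())  # the element itself plus descendants
    for el in order[idx + 1:]:
        if el not in inside and el.tag not in skip:
            return el
    return None


def select_p_and_li(body, skip):
    order = list(body.iter())  # all elements in document order
    hits = []
    for idx, el in enumerate(order):
        # *[self::p or self::li][a ...]
        if el.tag not in ("p", "li") or el.find("a") is None:
            continue
        # preceding::h2[1]: nearest h2 earlier in document order
        h2 = next((e for e in reversed(order[:idx]) if e.tag == "h2"), None)
        if h2 is None or "".join(h2.itertext()) != "Sales":
            continue
        nxt = first_following(order, idx, skip)
        if nxt is not None and nxt.tag == "hr":
            hits.append(el.text.strip())
    return hits


body = ET.fromstring(XML)
print(select_p_and_li(body, skip={"p", "li", "a"}))
print(select_p_and_li(body, skip={"p", "li"}))
```

With a in the skip set, the first call returns the li with an anchor and both sales paragraphs. Without it, "sales 0" drops out, because the first following element that survives the filter is the a inside the next p rather than the hr.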
However, I would recommend using a second xpath that specifically targets the li...
//li[a and preceding::h2[1][.='Sales'] and ../following-sibling::*[1][self::hr]]
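For reference, the logic of that li path can be sketched in plain Python with the standard xml.etree.ElementTree. Again this is an emulation rather than an XPath evaluation, sales_list_items is a hypothetical helper name, and it relies on the flat layout of the sample (h2, ul and hr are all direct children of body), so the nearest preceding h2 of an li is the last heading seen before its ul.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML: the "People" and first "Sales" lists.
XML = """<body>
<h2><font style="font-weight: bold">People</font></h2>
<ul>
<li>People 0 with <a href="http://">an anchor</a> element.</li>
</ul>
<hr/>
<h2><font style="font-weight: bold">Sales</font></h2>
<ul>
<li>List item 2 with an <a href="http://">an anchor</a> element.</li>
<li>List item 3 with an <a href="http://">an anchor</a> element.</li>
<li>List item 4 without an anchor element.</li>
</ul>
<hr />
</body>"""


def sales_list_items(body, title="Sales"):
    children = list(body)
    current_heading = None
    hits = []
    for i, el in enumerate(children):
        if el.tag == "h2":
            # Track the nearest heading, i.e. preceding::h2[1] for later items
            current_heading = "".join(el.itertext())
        elif el.tag == "ul" and current_heading == title:
            # ../following-sibling::*[1][self::hr]: the element right after
            # the parent ul must be an hr
            nxt = children[i + 1] if i + 1 < len(children) else None
            if nxt is not None and nxt.tag == "hr":
                # li[a ...]: keep only items that contain an <a> child
                for li in el.findall("li"):
                    if li.find("a") is not None:
                        hits.append(" ".join(li.text.split()[:3]))
    return hits


print(sales_list_items(ET.fromstring(XML)))  # ['List item 2', 'List item 3']
```

Checking the hr from the parent ul is what lets this path skip the exclusion list of child elements that the combined path needs.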