I have a problem: I need to exclude `p` elements with the class "price" when they have an ancestor with the class "promoted-list". Here is an example:
<table class="promoted-list">
<td>
<p class="price">I dont want this one</p>
</td>
</table>
<table>
<td>
<p class="price">I want this one</p>
</td>
</table>
I can't get this to work with the following XPath:
//p[contains(@class, 'price') and not(contains(@class, 'promoted-list'))]
It simply doesn't exclude the promoted one. Does anyone have a solution? In this case, the output should be "I want this one".
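Looking at the website link you posted, use these XPaths to get what you need (only the prices with a white rectangular background).
Note:
Some articles have no price, so the XPaths should always return the same number of results for articles and prices (number of articles = number of prices). "Zamienię" is excluded.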
//td[normalize-space(@class)="offer" and contains(.,"zł")]//h3//strong/text()
//td[normalize-space(@class)="offer" and contains(.,"zł")]//p/strong/text()
If you want to keep "Zamienię":
//td[normalize-space(@class)="offer"][*//p[@class="price"]]//h3//strong/text()
//td[normalize-space(@class)="offer"]//p[@class="price"]/strong/text()
Given a well-formed sample XML document such as
<root>
<table class="promoted-list">
<td>
<p class="price">I dont want this one</p>
</td>
</table>
<table>
<td>
<p class="price">I want this one</p>
</td>
</table>
</root>
the XPath expression that achieves this is:
//table[not(contains(@class, 'promoted-list'))]//p[contains(@class, 'price')]
In plain English, it means:
//table[not(contains(@class, 'promoted-list'))]//p[contains(@class, 'price')]
select all `table` elements,
but only if they do not have a `class` attribute whose value contains "promoted-list";
from the remaining `table` elements, select all `p` descendant elements,
but only if they have a `class` attribute whose value contains "price".
Output:
<p class="price">I want this one</p>
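For anyone who wants to check the logic without an XPath engine, here is a minimal sketch of the same steps in Python. It assumes the sample XML above and uses only the standard library's xml.etree.ElementTree, whose limited XPath subset has no contains() or not(), so the two filters are written out by hand. The function name non_promoted_prices is made up for this illustration. It also shows why the expression in the question fails: that predicate tests the class of the `p` element itself, while "promoted-list" sits on the ancestor `table`.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the answer above.
XML = """<root>
<table class="promoted-list">
<td>
<p class="price">I dont want this one</p>
</td>
</table>
<table>
<td>
<p class="price">I want this one</p>
</td>
</table>
</root>"""


def non_promoted_prices(xml_text):
    """Return the text of each price <p> inside a non-promoted <table>.

    Hypothetical helper that mirrors the XPath step by step. With nested
    tables a <p> could be listed twice; the sample has no nesting.
    """
    root = ET.fromstring(xml_text)
    results = []
    # Step 1: visit every <table> element in the document.
    for table in root.iter("table"):
        # Step 2: skip tables whose class attribute contains "promoted-list".
        if "promoted-list" in table.get("class", ""):
            continue
        # Step 3: keep descendant <p> elements whose class contains "price".
        for p in table.iter("p"):
            if "price" in p.get("class", ""):
                results.append(p.text)
    return results


print(non_promoted_prices(XML))
```

With a full XPath 1.0 engine, an equivalent way to express the exclusion from the `p` side would be //p[contains(@class, 'price') and not(ancestor::table[contains(@class, 'promoted-list')])].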