Python lxml: selecting with XPath without a double slash

The XPath documentation I read states that if an XPath contains no slash, the expression selects matching elements anywhere in the document.

However, when I try this with lxml.html in Python:

import requests
import lxml.html
s = requests.session()
page = s.get('http://lxml.de/')
html = lxml.html.fromstring(page.text)
p = html.xpath('p')

Here p is an empty list.

I have to use p = html.xpath('//p') instead.

Does anyone know why?

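That XPath expression assumes <p> sits directly under the node you are querying, but the root of the page is <html>, not <p>. An expression with no leading slash, such as p, is relative: it selects only the <p> elements that are direct children of the context node. Here the context node is the <html> element returned by lxml.html.fromstring, whose children are <head> and <body>, so the result is empty.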
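To make the point about the context node concrete, here is a minimal sketch that uses a small inline document instead of the live page. The SAMPLE markup is hypothetical and only stands in for a real page; the calls are the same lxml.html.fromstring and xpath used in the question.

```python
import lxml.html

# Hypothetical stand-in for a downloaded page (not the real lxml.de markup).
SAMPLE = """
<html>
  <head><title>demo</title></head>
  <body>
    <div><p>first</p><p>second</p></div>
  </body>
</html>
"""

root = lxml.html.fromstring(SAMPLE)

# The context node for root.xpath(...) is the <html> element.
print(root.tag)

# Its direct children are <head> and <body>; there is no <p> among them.
print([child.tag for child in root])

# Relative path: looks for <p> only among the direct children of <html>.
print(root.xpath('p'))

# Double slash: looks for <p> at any depth in the document.
print([p.text for p in root.xpath('//p')])
```

The relative query comes back empty for the same reason as in the question: it never looks below the first level of children.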

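Use a double slash, //p, to retrieve all <p> elements, or reference a specific <p> with an absolute path. The following prints the content of the first paragraph: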

p = html.xpath('/html/body/div/p')
print(p[0].text)
# lxml is the most feature-rich
# and easy-to-use library
# for processing XML and HTML
# in the Python language.

The equivalent with a double slash:

p = html.xpath('//p')
print(p[0].text)    
# lxml is the most feature-rich
# and easy-to-use library
# for processing XML and HTML
# in the Python language.

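To select <p> without a leading slash, first select its parent element with an XPath that does use slashes, then run the relative expression on that element: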

div = html.xpath('/html/body/div')[0]
p = div.xpath('p')
print(p[0].text)
# lxml is the most feature-rich
# and easy-to-use library
# for processing XML and HTML
# in the Python language.
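Once you are querying from a sub-element, the leading characters of the expression decide the search scope. The sketch below compares three forms on a hypothetical inline document (the ids and text are made up for illustration). Note that in lxml an absolute expression such as //p is evaluated against the whole document even when it is called on a sub-element, so .//p is the form to use for "anywhere below this element".

```python
import lxml.html

# Hypothetical markup, chosen so the three queries give different results.
SAMPLE = """
<html>
  <body>
    <div id="intro"><p>intro text</p><div><p>nested</p></div></div>
    <div id="other"><p>other text</p></div>
  </body>
</html>
"""

root = lxml.html.fromstring(SAMPLE)
intro = root.xpath('//div[@id="intro"]')[0]

# 'p': direct <p> children of the intro div only.
children = [p.text for p in intro.xpath('p')]

# './/p': every <p> at any depth below the intro div.
descendants = [p.text for p in intro.xpath('.//p')]

# '//p': absolute, so it searches the whole document, not just intro.
whole_doc = [p.text for p in intro.xpath('//p')]

print(children)
print(descendants)
print(whole_doc)
```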
