I am trying to parse the following XML using Python and lxml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/bind9.xsl"?>
<isc version="1.0">
<bind>
<statistics version="2.2">
<memory>
<summary>
<TotalUse>1232952256
</TotalUse>
<InUse>835252452
</InUse>
<BlockSize>598212608
</BlockSize>
<ContextSize>52670016
</ContextSize>
<Lost>0
</Lost>
</summary>
</memory>
</statistics>
</bind>
</isc>
The goal is to extract the tag name and text of each element under bind/statistics/memory/summary, to produce the following mapping:
TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0
I have managed to extract the element values, but I cannot figure out the XPath expression that returns the element tag names.
Sample script:
from lxml import etree as et

def main():
    xmlfile = "bind982.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "??????"  ## what to put here...?
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

    etree = xmldata.getroot()
    statlist = etree.xpath(location)
    for stat in statlist:
        label = stat.xpath(label_selector)[0]
        value = stat.xpath(value_selector)[0]
        print("{0}: {1}".format(label, value))

if __name__ == '__main__':
    main()
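I know I could use label = stat.tag instead of stat.xpath(), but the script must be generic enough to also handle other XML fragments where the label selector is different.
Which XPath selector returns an element's tag name?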
Just use XPath's name() function and drop the [0] index, because name() returns a string rather than a list.
from lxml import etree as et

def main():
    xmlfile = "ExtractXPathTagName.xml"
    location = "bind/statistics/memory/summary/*"
    label_selector = "name()"  # returns the tag name as a string, not a list
    value_selector = "text()"

    with open(xmlfile, "r") as data:
        xmldata = et.parse(data)

    etree = xmldata.getroot()
    statlist = etree.xpath(location)
    for stat in statlist:
        label = stat.xpath(label_selector)     # string result, so no [0]
        value = stat.xpath(value_selector)[0]  # first text node
        # strip() drops the trailing newline that is part of each text node
        print("{0}: {1}".format(label, value).strip())

if __name__ == '__main__':
    main()
Output:
TotalUse: 1232952256
InUse: 835252452
BlockSize: 598212608
ContextSize: 52670016
Lost: 0
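To make the difference in return types concrete, here is a minimal sketch that uses an inline, shortened copy of the question's XML instead of a file (the XML constant is an assumption made so the example is self-contained). It shows that name() evaluates to a single string while text() evaluates to a list of text nodes, which is why only the latter needs [0].

```python
from lxml import etree as et

# Shortened inline copy of the question's XML. Bytes are used because the
# document carries an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<isc version="1.0"><bind><statistics version="2.2"><memory><summary>
<TotalUse>1232952256
</TotalUse>
<Lost>0
</Lost>
</summary></memory></statistics></bind></isc>"""

root = et.fromstring(XML)
first = root.xpath("bind/statistics/memory/summary/*")[0]

label = first.xpath("name()")   # XPath string function: one string
values = first.xpath("text()")  # node-set of text nodes: a list

print(isinstance(label, str), label)
print(isinstance(values, list), len(values))
print(repr(values[0].strip()))  # the raw text node ends with a newline
```

Because the text nodes in this document end with a newline, stripping the value is what keeps the printed output on one line per element.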
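I don't think you need XPath for either value: element nodes have the attributes tag and text, so you can, for example, use a list comprehension: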
[(element.tag, element.text) for element in etree.xpath(location)]
Or, if you really want to use XPath:
result = [(element.xpath('name()'), element.xpath('string()')) for element in etree.xpath(location)]
Of course, you could also build a list of dictionaries:
result = [{ element.tag : element.text } for element in etree.xpath(location)]
or
result = [{ element.xpath('name()') : element.xpath('string()') } for element in etree.xpath(location)]
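If a single mapping is wanted rather than a list of one-entry dictionaries, the same idea can be sketched as below. The helper name summarize and the inline XML constant are hypothetical, introduced only for illustration. The selectors stay configurable, as the question asks, and both defaults are XPath functions that return strings; the values are stripped because the text in this document ends with a newline.

```python
from lxml import etree as et

# Inline copy of the question's XML (an assumption, so the sketch runs
# without a file on disk).
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<isc version="1.0"><bind><statistics version="2.2"><memory><summary>
<TotalUse>1232952256
</TotalUse>
<InUse>835252452
</InUse>
<BlockSize>598212608
</BlockSize>
<ContextSize>52670016
</ContextSize>
<Lost>0
</Lost>
</summary></memory></statistics></bind></isc>"""


def summarize(root, location, label_selector="name()", value_selector="string()"):
    """Hypothetical helper: map each selected element's label to its value.

    Both selectors are expected to be XPath expressions that evaluate to a
    string, such as name() and string().
    """
    result = {}
    for element in root.xpath(location):
        label = element.xpath(label_selector)
        value = element.xpath(value_selector)
        result[label] = value.strip()  # drop the trailing newline
    return result


stats = summarize(et.fromstring(XML), "bind/statistics/memory/summary/*")
for label, value in stats.items():
    print("{0}: {1}".format(label, value))
```

A plain dict keeps insertion order on Python 3.7 and later, so the entries print in document order.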