Python 3: cannot get an XML element's value with lxml.etree find()



I am trying to process a POST response whose body is XML. The result is stored as bytes (b''):

<?xml version="1.0" encoding="utf-8"?>
<result xmlns="http://something.com/Schema/V2/Result">
<success>false</success>
<returnType>ERROR</returnType>
<errors>
<error>
<message>Invalid signature</message>
<code>3002</code>
</error>
</errors>
</result>

Code:

from lxml import etree as et
root_node = et.fromstring(response.content)
print('{}'.format(root_node.find('.//returnType')))
return_type = root_node.find('.//returnType').text

The print statement outputs None, so find(...).text raises an exception.

If I iterate over the children with a for loop, I do get the nodes, but I can't deal with the namespace.

for tag in root_node.getchildren():
    print(tag)
<Element {http://something.com/Schema/V2/Result}returnType at 0x7f6c95542648>

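How do I get the XML nodes and their values? I tried Stack Overflow answers to similar questions, but none of them worked. I also tried using a regex to strip the schema and adding a prefix to the namespace.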

Edit: I tried the answer and got an error, so I still can't get the node.

/usr/bin/python3 /home/samoa/Scripts/Python/lxml_test.py
Traceback (most recent call last):
  File "/home/samoa/Scripts/Python/lxml_test.py", line 17, in <module>
    print(root.find("returnType", root.nsmap).text)
  File "src/lxml/lxml.etree.pyx", line 1537, in lxml.etree._Element.find (src/lxml/lxml.etree.c:58520)
  File "/usr/local/lib/python3.6/dist-packages/lxml/_elementpath.py", line 288, in find
    it = iterfind(elem, path, namespaces)
  File "/usr/local/lib/python3.6/dist-packages/lxml/_elementpath.py", line 277, in iterfind
    selector = _build_path_iterator(path, namespaces)
  File "/usr/local/lib/python3.6/dist-packages/lxml/_elementpath.py", line 234, in _build_path_iterator
    raise ValueError("empty namespace prefix is not supported in ElementPath")
ValueError: empty namespace prefix is not supported in ElementPath

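Pass a namespace map to the find() method. Since http://something.com/Schema/V2/Result is the default namespace of the document, you can simply do the following (if your lxml version rejects the None key in nsmap with the ValueError shown in the question's edit, use the {namespace}tag form from the test script further down instead):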

return_type_element = root_node.find('.//returnType', root_node.nsmap)

Or:

return_type_element = root_node.find('returnType', root_node.nsmap)
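If passing root_node.nsmap directly fails because of the None key, one possible workaround is to build a copy of the map in which the default namespace gets an explicit prefix. This is only a sketch: the prefix "r" is an arbitrary name chosen for illustration (it would clash if the document already declared a prefix "r"), and the XML is a shortened copy of the response from the question.

```python
from lxml import etree

content = b'''<?xml version="1.0" encoding="utf-8"?>
<result xmlns="http://something.com/Schema/V2/Result">
    <success>false</success>
    <returnType>ERROR</returnType>
</result>
'''

root = etree.fromstring(content)

# Copy nsmap, renaming the default namespace (key None) to the
# hypothetical prefix "r"; any declared prefixes are kept unchanged.
namespaces = {(prefix or "r"): uri for prefix, uri in root.nsmap.items()}

# The path now has to use the same prefix as the mapping.
return_type_element = root.find("r:returnType", namespaces)
return_type = return_type_element.text
print(return_type)
```

The prefix only has to match between the path and the dictionary; it does not need to appear in the XML itself.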

Also, the str.format() call in

print('{}'.format(root_node.find('.//returnType')))

is unnecessary; it can be shortened to:

return_type_element = root_node.find('returnType', root_node.nsmap)
print(return_type_element)
# <Element {http://something.com/Schema/V2/Result}returnType at 0x107c28bc0>

However, if you want to print return_type_element as XML, use the lxml.etree.tostring() function:

print(ET.tostring(return_type_element))
# b'<returnType xmlns="http://something.com/Schema/V2/Result">ERROR</returnType>\n    '

So your return_type can be obtained with:

return_type = root_node.find('returnType', root_node.nsmap).text
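Since find() returns None when nothing matches, chaining .text onto it raises AttributeError, which is what happened in the question. One way to avoid that, sketched here under the assumption that an empty string is an acceptable fallback, is findtext() with a default value. The tag name noSuchTag is made up purely to show the fallback.

```python
from lxml import etree

content = b'''<result xmlns="http://something.com/Schema/V2/Result">
    <success>false</success>
    <returnType>ERROR</returnType>
</result>'''

root = etree.fromstring(content)
ns = root.nsmap[None]  # URI of the default namespace

# findtext() returns the text of the first match, or the default if none.
return_type = root.findtext("{%s}returnType" % ns, default="")

# "noSuchTag" is a hypothetical name that does not exist in the document.
missing = root.findtext("{%s}noSuchTag" % ns, default="")

print(repr(return_type), repr(missing))
```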

My test script:

#!/usr/bin/env python3
from lxml import etree as ET
content = b'''<?xml version="1.0" encoding="utf-8"?>
<result xmlns="http://something.com/Schema/V2/Result">
    <success>false</success>
    <returnType>ERROR</returnType>
    <errors>
        <error>
            <message>Invalid signature</message>
            <code>3002</code>
        </error>
    </errors>
</result>
'''
root = ET.fromstring(content)
emptyns = root.nsmap[None]
print(root.find("{%s}returnType" % (emptyns)).text)
# step-by-step
root = ET.fromstring(content)
print("Root element: %s" % (root))
emptyns = root.nsmap[None]
print("Empty namespace: %s" % (emptyns))
return_type_element = root.find("{%s}returnType" % (emptyns))
print("<returnType> element: %s" % (return_type_element))
print("<returnType> element as XML: %s" % (ET.tostring(return_type_element)))
return_type = return_type_element.text
print('<returnType> text: %s' % (return_type))
# children
for element in root:  # iterating an element yields its children; getchildren() is deprecated
    print("Element tag (with namespace): %s" % (element.tag))
    _, _, tag = element.tag.rpartition("}")
    print("Element tag (without namespace): %s" % (tag))

Which produces:

ERROR
Root element: <Element {http://something.com/Schema/V2/Result}result at 0x102f63188>
Empty namespace: http://something.com/Schema/V2/Result
<returnType> element: <Element {http://something.com/Schema/V2/Result}returnType at 0x102f630c8>
<returnType> element as XML: b'<returnType xmlns="http://something.com/Schema/V2/Result">ERROR</returnType>\n    '
<returnType> text: ERROR
Element tag (with namespace): {http://something.com/Schema/V2/Result}success
Element tag (without namespace): success
Element tag (with namespace): {http://something.com/Schema/V2/Result}returnType
Element tag (without namespace): returnType
Element tag (with namespace): {http://something.com/Schema/V2/Result}errors
Element tag (without namespace): errors
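The same idea extends to the nested <error> entries. The following is a sketch rather than a definitive approach: it walks every <error> element and uses etree.QName to get each child's local name, as an alternative to the rpartition("}") trick above. The dictionary layout is my own choice for illustration.

```python
from lxml import etree

content = b'''<?xml version="1.0" encoding="utf-8"?>
<result xmlns="http://something.com/Schema/V2/Result">
    <success>false</success>
    <returnType>ERROR</returnType>
    <errors>
        <error>
            <message>Invalid signature</message>
            <code>3002</code>
        </error>
    </errors>
</result>
'''

root = etree.fromstring(content)
ns = root.nsmap[None]  # URI of the default namespace

errors = []
for error in root.iter("{%s}error" % ns):
    # QName(child).localname is the tag without its "{namespace}" part.
    errors.append({etree.QName(child).localname: child.text for child in error})

print(errors)
# [{'message': 'Invalid signature', 'code': '3002'}]
```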
