I'm using minidom to read my XML file, but it fails on the example below.
I want to retrieve the value inside the <span> tag (101.86090), but parsing raises an error.
Here is the code:
from xml.dom import minidom
docXML = minidom.parse('/root/Desktop/tpage.xml')
node = docXML.getElementsByTagName('span')[0]
t = node.firstChild.data
Contents of tpage.xml:
<span class="lp">
<span sys:innerhtml="{binding Last}"
sys:codeafter="$.quotebroker.setTitleProperties($dataItem, 'Last')">
101.86090
</span>
</span>
And this is the error message:
File "minidomrecup.py", line 5, in <module>
dom = parse('/root/Desktop/bot/tpage.xml')
File "/usr/lib/python2.7/xml/dom/minidom.py", line 1920, in parse
return expatbuilder.parse(file)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 924, in parse
result = builder.parseFile(fp)
File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 207, in parseFile
parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: unbound prefix: line 2, column 0
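The XML shown is invalid because it uses a namespace prefix (sys) without declaring it, and the XML parser (the xml.dom.expatbuilder module) chokes on that. minidom.parse() gives you no way to turn namespace handling off, so you have to call expatbuilder directly and pass its parse() function the argument that disables namespace processing. Also, if you want the text node of the second <span>, your index is off by one: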
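from xml.dom import expatbuilder

def main():
    # Second argument is namespaces=False: skip namespace processing,
    # so the undeclared "sys:" prefix no longer raises ExpatError.
    document = expatbuilder.parse('test.xml', False)
    # Index 1 is the inner <span>; index 0 is the outer wrapper.
    node = document.getElementsByTagName('span')[1]
    print(float(node.firstChild.data))

if __name__ == '__main__':
    main()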
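
If you want to check the behaviour without a file on disk, here is a minimal sketch that inlines the same markup and uses expatbuilder.parseString() instead of parse(). The names XML, read_last and default_parse_fails are made up for illustration and are not part of the question or of the standard library; the sketch assumes the document always has the two nested <span> elements shown above.

```python
from xml.dom import expatbuilder, minidom
from xml.parsers.expat import ExpatError

# Same markup as tpage.xml, inlined so the sketch runs without a file.
XML = """<span class="lp">
<span sys:innerhtml="{binding Last}"
sys:codeafter="$.quotebroker.setTitleProperties($dataItem, 'Last')">
101.86090
</span>
</span>"""


def read_last(xml_text):
    """Hypothetical helper: return the inner <span> text as a float."""
    # namespaces=False: "sys:innerhtml" is kept as a plain attribute name.
    document = expatbuilder.parseString(xml_text, False)
    inner = document.getElementsByTagName('span')[1]
    # float() ignores the newlines surrounding the number.
    return float(inner.firstChild.data)


def default_parse_fails(xml_text):
    """Return True if minidom's default, namespace-aware parse rejects the text."""
    try:
        minidom.parseString(xml_text)
    except ExpatError:
        return True
    return False


if __name__ == '__main__':
    print(default_parse_fails(XML))
    print(read_last(XML))
```

Bear in mind that turning namespaces off only works around the problem; if you control the file, declaring the sys prefix on the root element would let the default parser read it too.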