XML Python: choosing one of many attributes using ElementTree

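As far as I can tell this question is not a duplicate: I have been looking for a solution for days and simply cannot pin down the problem. I am trying to use Python to print a nested attribute from a tag in an XML document. I believe the error I am hitting is related to the tag I am reading from having more than one attribute. Is there a way to specify that I want the "status" value from the "second-tag" tag? Thank you very much for your help.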

My XML document, "test.xml":

<?xml version="1.0" encoding="UTF-8"?>
<first-tag xmlns="http://somewebsite.com/" date-produced="20130703" lang="en" produced-by="steve" status="OFFLINE">
    <second-tag country="US" id="3651653" lang="en" status="ONLINE">
    </second-tag>
</first-tag>

My Python file:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
whatiwant = root.find('second-tag').get('status')
print whatiwant

The error:

AttributeError: 'NoneType' object has no attribute 'get'

You are failing at .find('second-tag'), not at .get: find returned None, and the AttributeError comes from calling .get on that None.
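To make that failure mode visible, here is a minimal sketch in Python 3. It inlines the question's document as bytes so it runs on its own, and it uses a helper named attribute_of_child, which is a hypothetical name invented for this illustration and not part of ElementTree. It only shows that find returns None for the unqualified name; it does not yet fix the lookup.

```python
import xml.etree.ElementTree as ET

# The document from the question, inlined so the sketch is self-contained.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<first-tag xmlns="http://somewebsite.com/" date-produced="20130703" lang="en" produced-by="steve" status="OFFLINE">
    <second-tag country="US" id="3651653" lang="en" status="ONLINE">
    </second-tag>
</first-tag>"""

root = ET.fromstring(XML)

# find() returns None when no child matches, which is what happens here.
missing = root.find("second-tag")
print(missing is None)


def attribute_of_child(parent, tag, attribute):
    """Hypothetical helper: return the attribute of the first child
    matching `tag`, or None if there is no such child."""
    child = parent.find(tag)
    if child is None:
        return None
    return child.get(attribute)


# With the guard in place, the lookup yields None instead of raising.
print(attribute_of_child(root, "second-tag", "status"))
```

Checking the result of find before chaining .get onto it turns the confusing AttributeError into an explicit "no such element" case.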

For what you want, and for the idiom you are using, BeautifulSoup shines.

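from BeautifulSoup import BeautifulStoneSoup
# xml_string holds the text of the XML document, e.g. the contents of test.xml
xml_string = open('test.xml').read()
soup = BeautifulStoneSoup(xml_string)
whatyouwant = soup.find('second-tag')['status']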

I don't know ElementTree, but I would use ehp (easyhtmlparser). Here is the link: http://easyhtmlparser.sourceforge.net/ A friend told me about this tool. I am still learning it, but it is nice and simple.

from ehp import *
data = '''<?xml version="1.0" encoding="UTF-8"?>
<first-tag xmlns="http://somewebsite.com/" date-produced="20130703" lang="en" produced-by="steve" status="OFFLINE">
    <second-tag country="US" id="3651653" lang="en" status="ONLINE">
    </second-tag>
</first-tag>'''
html  = Html()
dom   = html.feed(data)
item = dom.fst('second-tag')
value = item.attr['status']
print value

The problem here is that there is no tag named second-tag. There is a tag named {http://somewebsite.com/}second-tag.

You can see this quite easily:

>>> print(list(root))
[<Element '{http://somewebsite.com/}second-tag' at 0x105b24190>]

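An XML parser that is not namespace-aware might do the wrong thing and ignore this, which would make your code work. A parser that bends over backward to be friendly (like BeautifulSoup) will, in effect, automatically try {http://somewebsite.com/}second-tag when you ask for second-tag. But ElementTree is neither of those.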

If that is not all you need to know, you should first read a tutorial on namespaces (perhaps this one).
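As a minimal sketch of what asking ElementTree for the qualified name can look like (Python 3, with the question's document inlined as bytes so the example is self-contained): the first lookup spells out the {uri}tag form directly, and the second passes a prefix map to find. The prefix "sw" is an arbitrary name chosen for this illustration and does not appear in the document.

```python
import xml.etree.ElementTree as ET

# The document from the question, inlined so the sketch is self-contained.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<first-tag xmlns="http://somewebsite.com/" date-produced="20130703" lang="en" produced-by="steve" status="OFFLINE">
    <second-tag country="US" id="3651653" lang="en" status="ONLINE">
    </second-tag>
</first-tag>"""

root = ET.fromstring(XML)

NS_URI = "http://somewebsite.com/"

# Option 1: write the qualified name out in full as {uri}local-name.
status_qualified = root.find("{%s}second-tag" % NS_URI).get("status")

# Option 2: map a prefix of our own choosing to the URI and use it in
# the path. "sw" is an arbitrary prefix picked for this example.
namespaces = {"sw": NS_URI}
status_prefixed = root.find("sw:second-tag", namespaces).get("status")

print(status_qualified)
print(status_prefixed)
```

Only the tag name needs qualifying here. The status attribute carries no prefix in the document, so it is not in any namespace and a plain .get("status") finds it.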
