Parsing a list of messages inside XML with Python



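I am trying to parse the messages below and store them in a list using Python. How can I store the messages in a list?

I was able to access the id and language with the code below, but I still need to get the messages.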

import xml.etree.ElementTree as ET
tree = ET.parse(filename)
# getiterator() was removed in Python 3.9; getroot() returns the <author> element
root = tree.getroot()
print(root.attrib)

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<author id="1aa8c430-853b-4bbc-b784-df4c88264ccd" lang="en">
<document><![CDATA[@username bahaha! unfortunately i'm not a tshirt person      ]]></document>
<document><![CDATA[Mercy reunion at Olive Garden #traditions :)     ]]></document>
<document><![CDATA[After 3 months of beating my head against a wall over this persuasion, I've finally had my "aha!" moment!! #progress! #improvement!      ]]></document>
</author>

Try:

import xml.etree.ElementTree as ET
tree = ET.parse("your_file.xml")
out = [d.text for d in tree.iter("document")]
print(out)

This prints:

[
"@username bahaha! unfortunately i'm not a tshirt person      ",
"Mercy reunion at Olive Garden #traditions :)     ",
'After 3 months of beating my head against a wall over this persuasion, I\'ve finally had my "aha!" moment!! #progress! #improvement!      ',
]
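
The messages in the sample keep the trailing spaces that sit inside the CDATA sections. Here is a minimal sketch that trims them, assuming you do not need that padding. It parses an inline string (the name SAMPLE_XML is made up for illustration, and the sample is shortened to two messages) instead of reading the file:

```python
import xml.etree.ElementTree as ET

# Hypothetical inline copy of the sample file, shortened to two messages.
SAMPLE_XML = """<author id="1aa8c430-853b-4bbc-b784-df4c88264ccd" lang="en">
<document><![CDATA[@username bahaha! unfortunately i'm not a tshirt person      ]]></document>
<document><![CDATA[Mercy reunion at Olive Garden #traditions :)     ]]></document>
</author>"""

root = ET.fromstring(SAMPLE_XML)

# CDATA content is exposed as ordinary element text; strip() drops the padding.
# "or ''" guards against an empty <document/>, whose text is None.
messages = [(d.text or "").strip() for d in root.iter("document")]
print(messages)
```

With a real file you would swap ET.fromstring(SAMPLE_XML) for ET.parse(filename).getroot(); the list comprehension stays the same.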

First, get the author tag, which is the root element:

tree = ET.parse(filename)
author = tree.getroot()  # the root element is the <author> tag

Then read the language from the author tag's attributes:

author.get('lang')

And the id:

author.get('id')
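
Attribute access on an Element is easy to get wrong, so here is a small sketch of the options. It assumes a one-element document built inline in place of the real file, and the "country" attribute is invented purely to show the default value:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real file: one author with one message.
author = ET.fromstring(
    '<author id="1aa8c430-853b-4bbc-b784-df4c88264ccd" lang="en">'
    "<document>hi</document>"
    "</author>"
)

lang = author.get("lang")                    # returns None if the attribute is absent
author_id = author.attrib["id"]              # plain dict lookup, raises KeyError if absent
missing = author.get("country", "unknown")   # "country" is made up; shows the default

# Integer indexing on an Element yields child elements, not attributes.
first_child = author[0]

print(lang, author_id, missing, first_child.tag)
```

Indexing with a string, as in author['lang'], raises a TypeError, because Element indexing is reserved for child positions.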

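If you want to store the contents of the author tag, you can build a dictionary for the authors:

import xml.etree.ElementTree as ET
tree = ET.parse(filename)

Iterate over the tree and collect the author tags:

# iter('author') yields every <author> element, including the root itself
authors = list(tree.iter('author'))

Get the language and id from each author tag:

# maps each language to an author id; authors sharing a language overwrite each other
authors = {author.get('lang'): author.get('id') for author in authors}

Print the authors dictionary:

print(authors)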
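
If the goal is to keep the messages together with the id and language, one possible layout is a dictionary keyed by author id. This is only a sketch: the helper name author_record, the short ids, and the inline sample are all assumptions made for illustration, not part of the original file.

```python
import xml.etree.ElementTree as ET


def author_record(author):
    """Collect id, language and trimmed messages from one <author> element."""
    return {
        "id": author.get("id"),
        "lang": author.get("lang"),
        "messages": [(d.text or "").strip() for d in author.iter("document")],
    }


# Hypothetical inline sample with a shortened id.
sample = ET.fromstring(
    '<author id="a1" lang="en">'
    "<document><![CDATA[first message  ]]></document>"
    "<document><![CDATA[second message ]]></document>"
    "</author>"
)

records = {}
record = author_record(sample)
records[record["id"]] = record  # key by id so authors sharing a language do not collide
print(records)
```

For several files you could call author_record(ET.parse(path).getroot()) once per file and add each result to records.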

import xml.etree.ElementTree as ET
tree = ET.parse(filename)
author = tree.getroot()
# lang and id are attributes, so use get(); find() only searches child elements
lang = author.get('lang')
author_id = author.get('id')
print(author.tag, lang, author_id)
