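I have a small Python script that reads several .xml files. Now I need to assert that these .xml files are not corrupted in any way. How can I check this? My current approach is: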
xml_tree = ET.parse(path)  # path = path to the .xml file
xml_file = xml_tree.getroot()
If the XML file is corrupt (that is, not well-formed), ET.parse() raises a ParseError exception:
>>> print open('test.xml').read()
This is not an XML file
>>> from xml.etree import ElementTree as ET
>>> ET.parse('test.xml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
Simply catch the exception:
try:
    ET.parse(path)
except ET.ParseError:
    print('{} is corrupt'.format(path))
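
Since the script reads several files, the same try/except can be wrapped in a loop. The following is only a minimal sketch: the function name find_malformed_xml and its return shape are hypothetical, not part of ElementTree. Note that ParseError only signals that a file is not well-formed XML; it says nothing about whether the content matches the structure you expect.

```python
import xml.etree.ElementTree as ET


def find_malformed_xml(paths):
    """Return (path, error message) pairs for files that fail to parse.

    Hypothetical helper for illustration; only well-formedness is checked.
    """
    malformed = []
    for path in paths:
        try:
            ET.parse(path)
        except ET.ParseError as err:
            # ParseError means the parser could not read the file as XML.
            malformed.append((path, str(err)))
    return malformed
```

An empty list back means every file parsed; otherwise each entry names the offending file together with the parser's message, which includes the line and column of the problem.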