Extracting attributes from elements in XML

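I have a script that extracts text and attributes from a number of XPaths. As each entry's data is extracted, it is appended to a list (all of an element's attributes, followed by its text, before moving on to the next XPath), and that list is then inserted into a DataFrame. My problem is that not every entry has the same attributes at every XPath. For example, every entry has the element in question and at least one corresponding attribute (color), but some cat elements may have additional attributes that not all cat elements have. This causes a problem when the row is inserted into the DataFrame, because the length of the row does not match the number of columns. The order of the attributes stays consistent except when one is missing. I need a way to insert an empty string whenever an attribute is effectively skipped because it is not present on the element.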

    import urllib.request
    import xml.etree.ElementTree

    for next_url in next_url_list:
        response = urllib.request.urlopen(next_url)
        bytes_ = response.read()
        root = xml.etree.ElementTree.fromstring(bytes_)
        # One pass per entry in the document
        for count in range(0, len(root.findall("./xpath:entry", namespaces=namespaces))):

            for xpath in xpaths:
                try:
                    attribs = list(root.findall(xpath, namespaces=namespaces)[count].attrib.keys())

                    # Append every attribute value that is present, then the text
                    for attrib in attribs:
                        award.append(root.findall(xpath, namespaces=namespaces)[count].attrib[attrib])

                    award.append(root.findall(xpath, namespaces=namespaces)[count].text)

                except IndexError:
                    pass

"I need a way to insert an empty string whenever an attribute is effectively skipped because it is not present on the element."

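- Make a dictionary of the expected attributes for each element, with empty strings as the values:

```
{'a1':'','a2':'',...}
```

- When you extract the attributes from an element, update the dictionary values.
- Use the dictionary to construct the row; missing attributes will have empty strings as their values.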
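
The steps above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual script: the sample XML, the `EXPECTED_ATTRIBS` mapping, and the `entry_to_row` helper are all hypothetical names made up for the example, and namespaces are omitted for brevity. It also assumes you know in advance which attributes each element may carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second <cat> lacks the "size" attribute.
SAMPLE_XML = """
<feed>
  <entry><cat color="black" size="small">Tom</cat></entry>
  <entry><cat color="white">Snowball</cat></entry>
</feed>
"""

# Hypothetical mapping: relative XPath -> expected attributes, in column order.
EXPECTED_ATTRIBS = {"cat": ["color", "size"]}


def entry_to_row(entry, expected=EXPECTED_ATTRIBS):
    """Build one fixed-length row: each element's attributes, then its text."""
    row = []
    for xpath, names in expected.items():
        element = entry.find(xpath)
        # Start with an empty string for every expected attribute.
        values = dict.fromkeys(names, "")
        if element is not None:
            # Overwrite the defaults with the attributes that are present.
            values.update((k, v) for k, v in element.attrib.items() if k in values)
        row.extend(values[name] for name in names)
        row.append((element.text or "") if element is not None else "")
    return row


root = ET.fromstring(SAMPLE_XML.strip())
rows = [entry_to_row(entry) for entry in root.findall("./entry")]
```

Because every row is built from the same list of expected names, each row has the same length and can be passed to the DataFrame with a fixed set of columns. Attributes not listed in the mapping are ignored in this sketch.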
