Python lxml: adding a SubElement to a parent based on a child element's text, function vs. class

Sorry, this is a bit long, but I want to be as detailed as possible.

I have the following sample XML file:

<root>
  <input_file>
    <type>x</type>
  </input_file>
  <input_file>
    <type>y</type>
  </input_file>
</root>

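I want to use the lxml package in Python 3.9 to add a subelement to each <input_file> based on the text of its <type> tag, so that the XML file looks like this: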

<root>
  <input_file>
    <type>x</type>
    <path>hi</path>
  </input_file>
  <input_file>
    <type>y</type>
    <path>hi_again</path>
  </input_file>
</root>

The following code works:

from lxml import etree as LET

xml_file = 'test.xml'
tree = LET.parse(xml_file)
root = tree.getroot()

# Add a <path> child to each <input_file>, chosen by the text of its <type>
for input_file in root.findall('input_file'):
    type_element = input_file.find('type')
    if type_element.text == 'x':
        c = LET.SubElement(input_file, 'path')
        c.text = 'hi'
    elif type_element.text == 'y':
        c = LET.SubElement(input_file, 'path')
        c.text = 'hi_again'

# Re-indent the whole tree once, then write it back to the same file
LET.indent(root, space="  ")
tree.write(xml_file)

When I try to do the same thing with the class I created below (file XMLReader.py):

from lxml import etree as LET
import string


class XMLReader(object):
    def __init__(self, file_path):
        self.file_path = file_path
        self.tree = LET.parse(self.file_path)
        self.root = self.tree.getroot()

    def set_sub_element(self, parent_tag, tag, info):
        child = LET.SubElement(parent_tag, tag)
        child.text = self.clean_string(info)
        self.root.append(child)
        LET.indent(self.root, space="  ")
        self.tree.write(self.file_path)

    def get_all_elements(self, tag):
        try:
            return self.root.findall(tag)
        except AttributeError:
            return None

    def clean_string(self, s):
        return ''.join(filter(lambda x: x in string.printable, s))

using the following code:

from XMLReader import XMLReader

items = {'x': 'hi', 'y': 'hi_again'}
xml_file = 'test.xml'
xml_test = XMLReader(xml_file)
input_file_tags = xml_test.get_all_elements('input_file')
for input_file in input_file_tags:
    type_element = input_file.find('type')
    if type_element.text in items:
        item = items[type_element.text]
        xml_test.set_sub_element(input_file, 'path', item)

I get the following resulting file:

<root>
  <input_file>
    <type>x</type>
  </input_file>
  <input_file>
    <type>y</type>
  </input_file>
  <path>hi</path>
  <path>hi_again</path>
</root>

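I would like to know what I am doing wrong here. I expected the same result as above, but the resulting <path> elements end up as children of <root> instead of children of the <input_file> whose <type> value they were chosen for.

As an exercise, here is a relatively simple way to do it: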

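# assumes `LET` and `root` from the question's working code
# start with your `items`
items = {'x': 'hi', 'y': 'hi_again'}
tags = ['input_file', 'type']
for v in items.keys():
    # select the first <input_file> whose <type> text equals v
    destination = root.xpath(f'//{tags[0]}/{tags[1]}[text()="{v}"]/..')[0]
    new_elem = LET.fromstring(f'<path>{items[v]}</path>')
    # insert at index 1, i.e. right after <type>
    destination.insert(1, new_elem)
LET.indent(root, space="  ")
print(LET.tostring(root).decode())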

Output:

<root>
  <input_file>
    <type>x</type>
    <path>hi</path>
  </input_file>
  <input_file>
    <type>y</type>
    <path>hi_again</path>
  </input_file>
</root>
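
The XPath approach above builds the expression with an f-string and takes only the first match. As a hedged variation (not part of the original answer), lxml's `xpath()` also accepts XPath variables passed as keyword arguments, which avoids quoting problems in the expression, and looping over the result handles every matching `<input_file>`. The variable name `t` and the inline sample document are assumptions made for illustration.

```python
from lxml import etree as LET

# Inline copy of the sample document so the sketch runs on its own.
root = LET.fromstring(
    "<root>"
    "<input_file><type>x</type></input_file>"
    "<input_file><type>y</type></input_file>"
    "</root>"
)
items = {"x": "hi", "y": "hi_again"}

for type_text, path_text in items.items():
    # $t is an XPath variable; lxml fills it from the keyword argument t=...
    for parent in root.xpath("//input_file[type = $t]", t=type_text):
        # SubElement appends <path> as the last child of the matched parent.
        path = LET.SubElement(parent, "path")
        path.text = path_text

LET.indent(root, space="  ")
print(LET.tostring(root).decode())
```

Setting `path.text` instead of parsing an interpolated `<path>...</path>` string also means values containing `&` or `<` are escaped on serialization rather than breaking the parse.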

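To answer what you did wrong in your code: in the set_sub_element method, you first add the new child as a subelement of the specified parent (in your code, <input_file/>). The SubElement factory from lxml.etree creates a new element and appends it to that parent. That alone should be enough to produce the output you want.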
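
A minimal sketch of that first step, using a one-element inline document (an assumption for illustration, not the asker's file): after `SubElement` returns, the new element is already attached to the parent that was passed in.

```python
from lxml import etree as LET

root = LET.fromstring("<root><input_file><type>x</type></input_file></root>")
input_file = root.find("input_file")

# SubElement both creates <path> and appends it to input_file.
child = LET.SubElement(input_file, "path")
child.text = "hi"

serialized = LET.tostring(root)
print(child.getparent().tag)  # input_file
print(serialized)
```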

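However, in the same method you then append the new child to the root node. This moves the element to a different place in the tree (it is now a child of the root rather than of <input_file/>). You should therefore remove the line that appends the child to the root. The set_sub_element method then looks like this: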

def set_sub_element(self, parent_tag, tag, info):
    # SubElement already attaches the new child to parent_tag
    child = LET.SubElement(parent_tag, tag)
    child.text = self.clean_string(info)
    LET.indent(self.root, space="  ")
    self.tree.write(self.file_path)
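
To see the difference in isolation, here is a small side-by-side sketch without the class or any file I/O. The helper names `add_path_buggy` and `add_path_fixed` are hypothetical and exist only for this comparison. It relies on documented lxml behavior: appending an element that already has a parent moves it rather than copying it.

```python
from lxml import etree as LET

XML = "<root><input_file><type>x</type></input_file></root>"


def add_path_buggy(root, parent, text):
    """Mirrors the question's method: attach to parent, then append to root."""
    child = LET.SubElement(parent, "path")
    child.text = text
    root.append(child)  # moves the element out of parent and under root
    return child


def add_path_fixed(root, parent, text):
    """Mirrors the corrected method: SubElement alone does the attachment."""
    child = LET.SubElement(parent, "path")
    child.text = text
    return child


buggy_root = LET.fromstring(XML)
buggy_child = add_path_buggy(buggy_root, buggy_root.find("input_file"), "hi")

fixed_root = LET.fromstring(XML)
fixed_child = add_path_fixed(fixed_root, fixed_root.find("input_file"), "hi")

print(buggy_child.getparent().tag)  # root
print(fixed_child.getparent().tag)  # input_file
```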
