How to keep XML tags numbered in sequence when adding/removing tags with Python

I am looking for a Python solution to the following problem. This is the current format of my XML file:

<step_1>abc</step_1>
<step_2>efg</step_2>
<step_3>hij</step_3>
<step_4>klm</step_4>

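I want to add or remove a tag somewhere between the first and the last one, and keep the tag names numbered in sequence. For example, if I delete <step_2>efg</step_2>, the result should look like this: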

<step_1>abc</step_1>
<step_2>hij</step_2>
<step_3>klm</step_3>

Is there a solution for this? Thank you in advance.

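I checked that the tag attribute of an XML element can be modified, at least when using lxml.

Another reason my solution is based on lxml is that it provides the xpath method, which is needed here.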
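
As a quick illustration of these two points, here is a minimal sketch. The sample document is made up for this example only; it selects elements with the XPath starts-with() function and then renames one of them by assigning to its tag:

```python
from lxml import etree as et

# Hypothetical sample document, used only for this illustration.
root = et.fromstring(
    "<xx><other>a1</other><step_1>abc</step_1><step_3>hij</step_3></xx>"
)

# XPath 1.0 starts-with() selects the children whose name begins with 'step'.
steps = root.xpath("*[starts-with(name(), 'step')]")
print([el.tag for el in steps])

# The tag of an element is writable; assigning to it renames the element in place.
steps[1].tag = "step_2"
print(et.tostring(root, encoding="unicode"))
```

The element keeps its text and its position in the tree; only its name changes.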

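First, assume that you have already added or removed some step_... elements. Your source tree may also contain elements with other names, so the whole tree now contains: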

<main>
  <xx>
    <other>a1</other>
    <step_1>abc</step_1>
    <step_3>hij</step_3>
    <other>a2</other>
    <step_4>klm</step_4>
    <step_6>xyz</step_6>
  </xx>
  <yy>
    <step_1>abc_2</step_1>
    <step_7>xyz_2</step_7>
    <step_2>efg_2</step_2>
    <other>a3</other>
    <step_4>klm_2</step_4>
  </yy>
</main>

I read the content above from a file:

from lxml import etree as et
parser = et.XMLParser(remove_blank_text=True)
tree = et.parse('Input.xml', parser)
root = tree.getroot()

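The idea is then:

Find each "parent" element that contains at least one step_... element.
Loop over its children whose names start with step.
Change each child's name (tag) to step_ + a consecutive number.

The code that does this, with test printouts, is: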

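# Every element that is the parent of at least one step... element
for el in root.xpath(".//*[starts-with(name(), 'step')]/.."):
    tg = el.tag
    print(f'Parent: {tg:7}')
    i = 0
    # Children of this parent whose name starts with 'step', in document order
    for el2 in el.xpath("*[starts-with(name(), 'step')]"):
        i += 1
        tg2 = el2.tag
        tt = el2.text
        if tt is None: tt = ''
        newName = f'step_{i}'
        print(f'  Child {i}: {tg2:7}  {tt:8} -> {newName}')
        el2.tag = newName  # rename the element in place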

It prints:

Parent: xx     
  Child 1: step_1   abc      -> step_1
  Child 2: step_3   hij      -> step_2
  Child 3: step_4   klm      -> step_3
  Child 4: step_6   xyz      -> step_4
Parent: yy     
  Child 1: step_1   abc_2    -> step_1
  Child 2: step_7   xyz_2    -> step_2
  Child 3: step_2   efg_2    -> step_3
  Child 4: step_4   klm_2    -> step_4

Now, when you print the content:

print(et.tostring(root, encoding='unicode', pretty_print=True))

the result is:

<main>
  <xx>
    <other>a1</other>
    <step_1>abc</step_1>
    <step_2>hij</step_2>
    <other>a2</other>
    <step_3>klm</step_3>
    <step_4>xyz</step_4>
  </xx>
  <yy>
    <step_1>abc_2</step_1>
    <step_2>xyz_2</step_2>
    <step_3>efg_2</step_3>
    <other>a3</other>
    <step_4>klm_2</step_4>
  </yy>
</main>

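As you can see:

The step_... elements have been renumbered, starting from 1 within each parent element.
All other elements keep their position and content.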
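
To tie this back to the example in the question, the same idea can be wrapped in a small helper and called after every add or delete. This is only a sketch under a few assumptions: the function name renumber_steps, the wrapping <root> element, and the temporary name step_new are made up for illustration, and the prefix is passed in as an XPath variable instead of being written into the expression.

```python
from lxml import etree as et


def renumber_steps(root, prefix="step_"):
    """Rename prefix-named children of each parent to prefix1, prefix2, ... in document order."""
    # Parents that have at least one child whose name starts with the prefix.
    parents = root.xpath(".//*[starts-with(name(), $p)]/..", p=prefix)
    for parent in parents:
        children = parent.xpath("*[starts-with(name(), $p)]", p=prefix)
        for i, child in enumerate(children, start=1):
            child.tag = f"{prefix}{i}"


# The four elements from the question, wrapped in a hypothetical <root>
# element so that the fragment parses as a document.
root = et.fromstring(
    "<root><step_1>abc</step_1><step_2>efg</step_2>"
    "<step_3>hij</step_3><step_4>klm</step_4></root>"
)

# Delete <step_2>, then close the gap in the numbering.
root.remove(root.find("step_2"))
renumber_steps(root)
print([(el.tag, el.text) for el in root])

# Add a new element after the first one, under any name that starts
# with the prefix (here the made-up 'step_new'), then renumber again.
new = et.Element("step_new")
new.text = "new"
root.insert(1, new)
renumber_steps(root)
print([(el.tag, el.text) for el in root])
```

Because the helper numbers the elements by their position in the tree, the name you give a newly inserted element does not matter, as long as it starts with the prefix.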
