Adding text to an XML element in Python 3



I have an XML file:

<listOfSpecies>
<species metaid="MAM00001c" sboTerm="SBO:0000247" id="MAM00001c" name="(-)-trans-carveol" compartment="c" initialConcentration="0" hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false" fbc:charge="0" fbc:chemicalFormula="C10H16O">
<annotation>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vCard4="http://www.w3.org/2006/vcard/ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
<rdf:Description rdf:about="#MAM00001c">
<bqbiol:is>
<rdf:Bag>

</rdf:Bag>
</bqbiol:is>
</rdf:Description>
</rdf:RDF>
</annotation>
</species>
...
</listOfSpecies>

and a text file:

name="(-)-trans-carveol" fbc:charge="0" fbc:chemicalFormula="C10H16O"
<rdf:li rdf:resource="https://identifiers.org/kegg.compound/C11409"/>
<rdf:li rdf:resource="https://identifiers.org/pubchem.compound/94221"/>
<rdf:li rdf:resource="https://identifiers.org/lipidmaps/LMPR0102090005"/>
<rdf:li rdf:resource="https://identifiers.org/inchi/InChI=1S/C10H16O/c1-7(2)9-5-4-   8(3)10(11)6-9/h4,9-11H,1,5-6H2,2-3H3/t9-,10+/m0/s1"/>
<rdf:li rdf:resource="https://identifiers.org/inchikey/BAVONGHXFVOKBV-VHSXEESVSA-N"/>
<rdf:li rdf:resource="https://identifiers.org/metanetx.chemical/MNXM45735"/>

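For each species/name, I want to insert all of the 'rdf:li rdf:resource' elements from the text file between the rdf:Bag tags in the XML file. So far I have tried minidom, BeautifulSoup, ElementTree, and treating the XML file as a regular text file, but I have not found anything that works. Can anyone point me in the right direction?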

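To add all of the rdf:li rdf:resource elements from the txt file under the <rdf:Bag> tag in the XML file, you can use zip() to loop over both files together and Tag.insert() to add the new tags.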

Here is an example. You will need to modify it slightly so that it reads the tags from your files rather than from triple-quoted strings:

from bs4 import BeautifulSoup

xml = """
<listOfSpecies>
<species metaid="MAM00001c" sboTerm="SBO:0000247" id="MAM00001c" name="(-)-trans-carveol" compartment="c" initialConcentration="0" hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false" fbc:charge="0" fbc:chemicalFormula="C10H16O">
<annotation>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vCard4="http://www.w3.org/2006/vcard/ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
<rdf:Description rdf:about="#MAM00001c">
<bqbiol:is>
<rdf:Bag>

</rdf:Bag>
</bqbiol:is>
</rdf:Description>
</rdf:RDF>
<rdf:Bag>

</rdf:Bag>
</annotation>
</species>
</listOfSpecies>
"""
txt = """
name="(-)-trans-carveol" fbc:charge="0" fbc:chemicalFormula="C10H16O"
<rdf:li rdf:resource="https://identifiers.org/kegg.compound/C11409"/>
<rdf:li rdf:resource="https://identifiers.org/pubchem.compound/94221"/>
<rdf:li rdf:resource="https://identifiers.org/lipidmaps/LMPR0102090005"/>
<rdf:li rdf:resource="https://identifiers.org/inchi/InChI=1S/C10H16O/c1-7(2)9-5-4-   8(3)10(11)6-9/h4,9-11H,1,5-6H2,2-3H3/t9-,10+/m0/s1"/>
<rdf:li rdf:resource="https://identifiers.org/inchikey/BAVONGHXFVOKBV-VHSXEESVSA-N"/>
<rdf:li rdf:resource="https://identifiers.org/metanetx.chemical/MNXM45735"/>
"""
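# Parse both inputs. The "lxml" HTML parser lowercases tag and attribute names.
xml_soup = BeautifulSoup(xml, "lxml")
txt_soup = BeautifulSoup(txt, "lxml")
# Pair each rdf:li with one rdf:Bag and insert its URL as the bag's first child.
for resource, bag in zip(txt_soup.find_all("rdf:li"), xml_soup.find_all("rdf:bag")):
    bag.insert(0, resource["rdf:resource"])
print(xml_soup.prettify())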

Output:

<listofspecies>
<species boundarycondition="false" compartment="c" constant="false" fbc:charge="0" fbc:chemicalformula="C10H16O" hasonlysubstanceunits="false" id="MAM00001c" initialconcentration="0" metaid="MAM00001c" name="(-)-trans-carveol" sboterm="SBO:0000247">
<annotation>
<rdf:rdf xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:vcard="http://www.w3.org/2001/vcard-rdf/3.0#" xmlns:vcard4="http://www.w3.org/2006/vcard/ns#">
<rdf:description rdf:about="#MAM00001c">
<bqbiol:is>
<rdf:bag>
https://identifiers.org/kegg.compound/C11409
</rdf:bag>
</bqbiol:is>
</rdf:description>
</rdf:rdf>
<rdf:bag>
https://identifiers.org/pubchem.compound/94221
</rdf:bag>
</annotation>
</species>
</listofspecies>
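
Note that in the output above each rdf:Bag received a single URL as plain text, because zip() pairs one rdf:li with one bag, and the tag names were lowercased by the HTML parser. As an alternative, here is a minimal sketch using the standard library's xml.etree.ElementTree that appends every rdf:li to the bag of the species with the matching name and keeps the original casing. It rests on several assumptions that are not confirmed by the question: the text file consists of a `name="..."` header line followed by that species' rdf:li lines, the bag to fill is the one under bqbiol:is, and the helper names `read_resources` and `fill_bags` are made up for illustration. The sample data is trimmed down, and `{*}` wildcards in paths need Python 3.8 or later.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"
# Keep the familiar prefixes when the tree is serialized again.
ET.register_namespace("rdf", RDF)
ET.register_namespace("bqbiol", BQBIOL)

# Trimmed-down stand-ins for the two files (assumed layout).
XML = """<listOfSpecies>
<species id="MAM00001c" name="(-)-trans-carveol">
<annotation>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
<rdf:Description rdf:about="#MAM00001c">
<bqbiol:is>
<rdf:Bag>
</rdf:Bag>
</bqbiol:is>
</rdf:Description>
</rdf:RDF>
</annotation>
</species>
</listOfSpecies>"""

TXT = '''name="(-)-trans-carveol" fbc:charge="0" fbc:chemicalFormula="C10H16O"
<rdf:li rdf:resource="https://identifiers.org/kegg.compound/C11409"/>
<rdf:li rdf:resource="https://identifiers.org/pubchem.compound/94221"/>
'''


def read_resources(text):
    """Map each species name to the list of rdf:resource URLs listed below it."""
    resources, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        name = re.match(r'name="([^"]*)"', line)
        if name:
            # A header line starts a new group of URLs.
            current = resources.setdefault(name.group(1), [])
            continue
        url = re.search(r'rdf:resource="([^"]*)"', line)
        if url and current is not None:
            current.append(url.group(1))
    return resources


def fill_bags(root, resources):
    """Append one rdf:li per URL to the bqbiol:is bag of each matching species."""
    # "{*}species" matches the tag in any namespace, or in none.
    for species in root.findall(".//{*}species"):
        bag = species.find(f".//{{{BQBIOL}}}is/{{{RDF}}}Bag")
        if bag is None:
            continue
        for url in resources.get(species.get("name"), []):
            ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": url})


root = ET.fromstring(XML)
fill_bags(root, read_resources(TXT))
result = ET.tostring(root, encoding="unicode")
print(result)
```

With real files, ET.parse(path).getroot() would replace ET.fromstring(XML), and the text file's contents would be passed to read_resources. Any other prefixes used in the full document (for example fbc) would also need to be registered with ET.register_namespace if they should survive serialization unchanged.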
