Using Nokogiri with multiple search criteria

In this XML fragment, I need to replace the data in the UID of certain blocks. The real file contains more than 100 similar blocks.

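I have been able to extract a subset based on name="Track (TimeLine)", but I am struggling to narrow that subset down to the specific block I need using the data in <TrackID>. That is: if name="Track (TimeLine)" and the text of <TrackID> is 0x1200, set the UID to xxxx.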

I'm new to Nokogiri, and although I write test scripts, I don't consider myself a programmer.

<StructuralMetadata key="06.0E.2B.34.02.53.01.01.0D.01.01.01.01.01.3B.00" length="116" name="Track (TimeLine)">
    <EditRate>25/1</EditRate>
    <Origin>0</Origin>
    <Sequence>32-04-25-67-E7-A7-86-4A-9B-28-53-6F-66-74-65-6C</Sequence>
    <TrackID>0x1200</TrackID>
    <TrackName>Softel VBI Data</TrackName>
    <TrackNumber>0x17010101</TrackNumber>
    <UID>34-C1-B9-B9-5F-07-A4-4E-8F-F4-53-6F-66-74-65-6C</UID>
</StructuralMetadata>
<StructuralMetadata key="06.0E.2B.34.02.53.01.01.0D.01.01.01.01.01.3B.00" length="116" name="Track (TimeLine)">
    <EditRate>25/1</EditRate>
    <Origin>0</Origin>
    <Sequence>35-12-2D-86-E6-74-0B-4C-B4-24-53-6F-66-74-65-6C</Sequence>
    <TrackID>0x1300</TrackID>
    <TrackName>Softel VBI Data</TrackName>
    <TrackNumber>0x0</TrackNumber>
    <UID>37-0C-80-34-4C-8D-CE-41-85-F3-53-6F-66-74-65-6C</UID>
</StructuralMetadata>

Using XPath:

//StructuralMetadata

This selects every StructuralMetadata element in the XML. The leading double slash means "match this node wherever it appears in the document".

You don't want all of the nodes, though, so you can use a predicate to filter for the ones you need:

//StructuralMetadata[@name="Track (TimeLine)" and TrackID="0x1200"]

This selects every StructuralMetadata element that has a name attribute with the value Track (TimeLine) and a TrackID child element whose content is 0x1200.

Since it is the UID element you are interested in, you can refine the expression further:

//StructuralMetadata[@name="Track (TimeLine)" and TrackID="0x1200"]/UID

This expression matches every UID element that is a child of a StructuralMetadata element satisfying the predicate above.

Putting this to use:

require 'nokogiri'
# Parse the document, assuming xml_file is a File object containing the XML
doc = Nokogiri::XML(xml_file)
# I'm assuming there is only one element in the document that matches
# the criteria, so I'm using at_xpath
node = doc.at_xpath('//StructuralMetadata[@name="Track (TimeLine)" and TrackID="0x1200"]/UID')
# At this point, doc contains a representation of the xml, and node points to
# the UID node within that representation. We can update the contents of
# this node
node.content = 'XXX'
# Now write out the updated XML. This just writes it to standard output,
# you could write it to a file or elsewhere if needed
puts doc.to_xml

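A good way to approach this is a "map reduce" style of programming: take a large list, narrow it down, and combine what is left into the result you want. In particular, the find and select methods are very useful for this kind of problem. They come from Enumerable, so they work on arrays and on the NodeSet that Nokogiri returns. Take a look at this example: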

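require 'nokogiri'

# Parse the file, then pick the first <StructuralMetadata> block whose name
# attribute and <TrackID> text both match
xml = Nokogiri::XML.parse(File.read('sample.xml'))
element = xml.css('StructuralMetadata').find do |item|
  item['name'] == 'Track (TimeLine)' && item.css('TrackID').text == '0x1200'
end
# find returns nil when nothing matches, so guard before printing
puts element.to_xml if element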

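This small program first uses a CSS selector to get all of the <StructuralMetadata> elements in the document. That returns a NodeSet, which behaves much like an array, so we can use find to filter it down to the one we want. select is its cousin: it returns all of the matching objects rather than just the first one it happens to find.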

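Inside the block there is a test that checks whether a given <StructuralMetadata> tag is the one we are looking for. The program then prints the element.to_xml string to the console, so that if you run it as a command-line script you can see what it found. Once you can find the element, you can modify it in the usual way and save a new XML file or whatever else you need.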
