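For the last few days I have been looking for a way to get multiple nodes with Nokogiri, depending on a reference variable in an ancestor node.
What I need: I am collecting the "Id" of every "Segment" node. Then I want to collect all the "Resource" nodes that follow inside that "Segment" node. To collect the "Resource" nodes, I want to set the "Id" as a variable.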
<CPL>
<SegmL>
<Segment>
<Id>UUID</Id> #UUID as a variable
<Name>name_01</Name>
<SequL>
<ImageSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource> #depending on SegmentId
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</ImageSequence>
<AudioSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource>
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</AudioSequence>
</SequL>
</Segment>
<Segment>
<Id>UUIDa</Id>
<Name>name_02</Name>
<SequL>
<ImageSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource>
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</ImageSequence>
<AudioSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource>
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</AudioSequence>
</SequL>
</Segment>
</SegmL>
</CPL>
A = Resource.css("A").text.gsub(/\n/,"")
Collecting all of the resource data:
#first each do
cpls.each_with_index do |(cpl_uuid, mycpl), index|
cpl_filename = mycpl
cpl_file = File.open("#{resource_uri}/#{cpl_filename}")
cpl = Nokogiri::XML( cpl_file ).remove_namespaces!
#get UUID for UUID checks
cpl_uuid = cpl.css("Id").first.text.gsub(/\n/,"")
cpl_root_edit_rate = cpl.css("EditRate").first.text.gsub(/\s+/, "/")
#second each do
cpl.css("Segment").each do |s| # loop segment
cpl_segment_list_uuid = s.css("Id").first.text.gsub(/\n/,"") #uuid of segment list
#third each do
cpl.css("Resource").each do |f| #loop resources
cpl_A = f.css("A").text.gsub(/\n/,"") # uuid of A
cpl_B = f.css("B").text.gsub(/\n/,"") # uuid of B
end #third
end #second
end #first
My expression stores this information in an array:
A = 48000.0
B = 240000.0
C = 0.0
D = 240000.0
Some functions to calculate an average on the resources.
puts all_arry
A = 5.0
B = 5.0
C = 5.0
D = 5.0
A = 5.0
B = 5.0
C = 5.0
D = 5.0
= 8 values -> only 4 values actually exist for each loop (2 average values per Segment)
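At the moment every "SegmentId" collects all of the "Resource" nodes.
How can I assign the resources that follow to exactly one Segment Id, used as a variable?
I have tried this code, but the loop is empty. I think that is because there are several more nodes between the "Id" of the "Segment" and the "A" and "B" of each "Resource":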
if cpl.at("Segment/Id:contains(\"#{cpl_segment_list_uuid}\")")
cpl.css("Resource").each do |f|
#collecting resources here for each segment
end
end
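None of the nodes have attributes, ids, classes, and so on.
I hope you can help me with this. Thank you in advance for your support!
Update 10/07/16
I also ran the "each do" over the resources with the following expression: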
expression = "/SegmetList/Segment[Id>cpl_segment_list_uuid]"
cpl.xpath(expression).each do |f|
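It runs the "each do", but I do not get the inner nodes.
cpl.css("Segment:contains(\"#{cpl_segment_list_uuid}\") > Resource").each do |f|
gives the same result as the previous one.
The "if" condition has the same problem too:
if cpl.at("Segment/Id:contains(\"#{cpl_segment_list_uuid}\")").each do|f|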
#some code
end
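Update 2016/18/10
I now get the correct number of resources (4), but they are still not separated per Segment. So each Segment contains the same four resources.
Why do I not get double the number of resources, given that I create the array inside the "Segment" loop?
This is the current code: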
#first each do
cpls.each_with_index do |(cpl_uuid, mycpl), index|
cpl_filename = mycpl
cpl_file = File.open("#{resource_uri}/#{cpl_filename}")
cpl = Nokogiri::XML( cpl_file ).remove_namespaces!
#get UUID for UUID checks
cpl_uuid = cpl.css("Id").first.text.gsub(/\n/,"")
cpl_root_edit_rate = cpl.css("EditRate").first.text.gsub(/\s+/, "/")
#second each do
cpl.css("Segment").each do |s| # loop segment
cpl_segment_list_uuid = s.css("Id").first.text.gsub(/\n/,"") #uuid of segment list
array_for_resource_data = Array.new
#third each do
s.css("Resource").each do |f| #loop resources #all resources
s.search('//A | //B').each do |f| #selecting only resources "A" and "B"
cpl_A = f.css("A").text.gsub(/\n/,"") # uuid of A
cpl_B = f.css("B").text.gsub(/\n/,"") # uuid of B
end #third
end #second
end #first
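I hope my update gives you more detail. Thank you very much for your help and answers!
Update 2016/31/10
The problem with the duplicated segment output is solved. Now I have one more loop over each sequence below the segment: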
cpl.css("Segment").each do |u|
segment_list_uuid = u.css("Id").first.text.gsub(/\n/,"")
sequence_list_uuid_arr = Array.new
u.xpath("//SequenceList[//*[starts-with(name(),'Sequence')]]").each do |s|
sequence_list_uuid = s.css("TrackId").first.text#.gsub(/\n/,"")
sequence_list_uuid_arr.push(sequence_list_uuid)
#following some resource nodes
s.css("Resource").each do |f|
asset_uuid = f.css("TrackFileId").text.gsub(/\n/,"")
resource_uuid = f.css("Id").text.gsub(/\n/,"")
edit_rate = f.css("EditRate").text.gsub(/\s+/, "/")
#some more code
end #resource
end #sequence list
end #segment
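Now I want to get all the different "Resource" nodes under each unique sequence. I have to list all the different resources and sum up some of the collected values.
Is there a way to collect every resource with differing values (child nodes) under the same "sequence id"? At the moment I have no idea how to solve this, so I have no code to show you that even partly works.
each_with_index on the "Resource" loop does not work.
Do you have any ideas or any approach that could help me with my new problem?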
Try
resource.search('.//A | .//B')
.// anchors the XPath query at the current element instead of searching the whole document.
elem = doc.search('ImageSequence').first
elem.search('//A') # returns all A in the whole document
elem.search('.//A') # returns all A inside element
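This is a common problem when taking XML apart. Write code that resembles the layout of the data in the XML, allowing for repeated blocks of similar data.
For instance:
require 'nokogiri'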
cpl = Nokogiri::XML(<<EOT)
<CPL>
<SegmL>
<Segment>
<Id>UUID</Id> #UUID as a variable
<Name>name_01</Name>
<SequL>
<ImageSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource> #depending on SegmentId
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</ImageSequence>
<AudioSequence>
<Id>UUID</Id>
<Track>UUID</Track>
<ResourceList>
<Resource>
<A>aaa</A>
<B>bbb</B>
<C>ccc</C>
<D>ddd</D>
</Resource>
</ResourceList>
</AudioSequence>
</SequL>
</Segment>
</SegmL>
</CPL>
EOT
First find the node that contains the data you want to iterate over, then start descending into it:
data = cpl.search('Segment').each_with_object([]) { |segment, ary|
hash = {}
hash[:id] = segment.at('Id').text
hash[:name] = segment.at('Name').text
image_sequence = segment.at('ImageSequence')
image_sequence_h = {}
image_sequence_h[:id] = image_sequence.at('Id').text
image_sequence_h[:track] = image_sequence.at('Track').text
image_resources_h = {
a: image_sequence.at('A').text,
b: image_sequence.at('B').text,
c: image_sequence.at('C').text,
d: image_sequence.at('D').text,
}
audio_sequence = segment.at('AudioSequence')
audio_sequence_h = {}
audio_sequence_h[:id] = audio_sequence.at('Id').text
audio_sequence_h[:track] = audio_sequence.at('Track').text
audio_resources_h = {
a: audio_sequence.at('A').text,
b: audio_sequence.at('B').text,
c: audio_sequence.at('C').text,
d: audio_sequence.at('D').text,
}
image_sequence_h[:resources] = image_resources_h
audio_sequence_h[:resources] = audio_resources_h
hash[:image_sequence] = image_sequence_h
hash[:audio_sequence] = audio_sequence_h
ary << hash
}
This is more verbose than I would normally write, because I wanted the steps to be clear.
The final result is an array of hashes:
# => [{:id=>"UUID",
# :name=>"name_01",
# :image_sequence=>
# {:id=>"UUID",
# :track=>"UUID",
# :resources=>{:a=>"aaa", :b=>"bbb", :c=>"ccc", :d=>"ddd"}},
# :audio_sequence=>
# {:id=>"UUID",
# :track=>"UUID",
# :resources=>{:a=>"aaa", :b=>"bbb", :c=>"ccc", :d=>"ddd"}}}]
It is then easy to walk the array and access individual blocks of data or single elements:
data[0][:image_sequence][:id] # => "UUID"
data[0][:audio_sequence][:resources][:d] # => "ddd"
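Walking the whole array follows the same pattern. The snippet below is a small sketch that rebuilds the hash literal by hand so it runs on its own; the summary variable and the output format are assumptions made for illustration.

```ruby
# Hypothetical data in the same shape as the array of hashes built above.
data = [
  { id: 'UUID',
    name: 'name_01',
    image_sequence: { id: 'UUID', track: 'UUID',
                      resources: { a: 'aaa', b: 'bbb', c: 'ccc', d: 'ddd' } },
    audio_sequence: { id: 'UUID', track: 'UUID',
                      resources: { a: 'aaa', b: 'bbb', c: 'ccc', d: 'ddd' } } }
]

# Build one line per segment; Hash#dig returns nil instead of raising
# if a level is missing.
summary = data.map do |segment|
  image_a = segment.dig(:image_sequence, :resources, :a)
  audio_a = segment.dig(:audio_sequence, :resources, :a)
  "#{segment[:name]}: image A=#{image_a}, audio A=#{audio_a}"
end

puts summary
```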