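My task is to convert XML tables to HTML tables. The problem is that the XML doesn't follow HTML conventions, so I have to move nodes into the right places. The headers are listed in pre-order rather than level-order, and there are table notes between the last row of the table and the closing table tag.
I solved the pre-order to level-order conversion by using a builder to compute and create the HTML, then replacing the XML table header with the generated HTML. But the last problem, which ought to be simple, is driving me up the wall. I need to move the <TNOTE> out of the <GPOTABLE> and put it in a <div> after the </GPOTABLE>.
A fragment of the XML data: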
<P>(vi) Grinding wheels or discs for vertical single-spindle disc grinders shall be encircled with hoods to remove the dust generated in the operation. The hoods shall be connected to one or more branch pipes having exhaust volumes as shown in Table D-57.5.</P>
<GPOTABLE CDEF="s15,6,6,6,6" COLS="5" OPTS="L2">
<TTITLE>Table D-57.5—Vertical Spindle Disc Grinder</TTITLE>
<BOXHD>
<CHED H="1">Disc diameter, inches (cm)</CHED>
<CHED H="1">One-half or more of disc covered</CHED>
<CHED H="2">Number <SU>1</SU>
</CHED>
<CHED H="2">Exhaust foot <SU>3</SU>/min.</CHED>
<CHED H="1">Disc not covered</CHED>
<CHED H="2">Number <SU>1</SU>
</CHED>
<CHED H="2">Exhaust foot<SU>3</SU>/min.</CHED>
</BOXHD>
<ROW>
<ENT I="01">Up to 20 (50.8)</ENT>
<ENT>1</ENT>
<ENT>500</ENT>
<ENT>2</ENT>
<ENT>780</ENT>
</ROW>
<!-- ....snip .... -->
<ROW>
<ENT I="01">Over 53 to 72 (134.62 to 182.88)</ENT>
<ENT>2</ENT>
<ENT>3,140</ENT>
<ENT>5</ENT>
<ENT>6,010</ENT>
</ROW>
<TNOTE>
<SU>1</SU> Number of exhaust outlets around periphery of hood, or equal distribution provided by other means.</TNOTE>
</GPOTABLE>
<P>(vii) Grinding and polishing belts shall be provided with hoods to remove dust and dirt generated in the operations and the hoods shall be connected to branch pipes having exhaust volumes as shown in Table D-57.6.</P>
After conversion to HTML, it should look like this:
<table cdef="s15,6,6,6,6" cols="5" opts="L2">
<caption>Table D-57.5—Vertical Spindle Disc Grinder</caption>
<tr>
<th rowspan="2" colspan="1" class="table_header">Disc diameter, inches (cm)</th>
<th rowspan="1" colspan="2" class="table_header">One-half or more of disc covered</th>
<th rowspan="1" colspan="2" class="table_header">Disc not covered</th>
</tr>
<tr>
<th rowspan="1" colspan="1" class="table_header">Number <su>1</su></th>
<th rowspan="1" colspan="1" class="table_header">Exhaust foot <su>3</su>/min.</th>
<th rowspan="1" colspan="1" class="table_header">Number <su>1</su> </th>
<th rowspan="1" colspan="1" class="table_header">Exhaust foot<su>3</su>/min.</th>
</tr>
<tr>
<td i="01">Up to 20 (50.8)</td>
<td>1</td>
<td>500</td>
<td>2</td>
<td>780</td>
</tr>
<!-- .... snip .... -->
<tr>
<td i="01">Over 53 to 72 (134.62 to 182.88)</td>
<td>2</td>
<td>3,140</td>
<td>5</td>
<td>6,010</td>
</tr>
</table>
<div class='tnote'><su>1</su> Number of exhaust outlets around periphery of hood, or equal distribution provided by other means</div>
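For readers following along, the rowspan and colspan values in the header rows above can be derived from the H levels of the <CHED> elements alone. What follows is only a minimal sketch of that idea, not the TableConverter from this post (whose grid logic is snipped). The Header struct and the level_order_rows method are hypothetical names, and the sketch assumes every child header sits exactly one level below its parent.

```ruby
# Hypothetical helper: one header cell with its computed spans.
Header = Struct.new(:level, :text, :rowspan, :colspan)

# Takes [[level, text], ...] in pre-order (document order) and returns an
# array of rows, one per level, with rowspan and colspan filled in.
# Assumption: a child header is always exactly one level below its parent.
def level_order_rows(preorder)
  headers = preorder.map { |level, text| Header.new(level, text, 1, 1) }
  depth = headers.map(&:level).max

  headers.each_with_index do |h, i|
    # Descendants are the headers after h, up to the next header whose
    # level is not deeper than h's own level.
    descendants = headers.drop(i + 1).take_while { |d| d.level > h.level }
    if descendants.empty?
      # A leaf header stretches down to the last header row.
      h.rowspan = depth - h.level + 1
    else
      # A parent header spans one column per leaf underneath it.
      h.colspan = descendants.each_with_index.count do |d, j|
        nxt = descendants[j + 1]
        nxt.nil? || nxt.level <= d.level
      end
    end
  end

  (1..depth).map { |level| headers.select { |h| h.level == level } }
end

# The <CHED> elements from the sample table, as [H, text] pairs.
sample = [
  [1, 'Disc diameter, inches (cm)'],
  [1, 'One-half or more of disc covered'],
  [2, 'Number'],
  [2, 'Exhaust foot3/min.'],
  [1, 'Disc not covered'],
  [2, 'Number'],
  [2, 'Exhaust foot3/min.']
]

level_order_rows(sample).each do |row|
  puts row.map { |h| "#{h.text} (rowspan=#{h.rowspan}, colspan=#{h.colspan})" }.join(' | ')
end
```

Each inner array corresponds to one <tr> of <th> cells, so the result can be handed to whatever builder generates the header markup.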
Here's what I've got so far:
require 'nokogiri'

def xslt_tables(xml_text)
  frag = Nokogiri::HTML(xml_text)
  frag.xpath("//gpotable").each do |table|
    TableConverter.new(table)
    table.name = 'table'
  end
  frag.inner_html
end
class TableConverter
  attr_accessor :data, :rows, :columns, :frag
  # Expects a nokogiri object (a single <gpotable> node), not merely an html fragment
  def initialize(nokogiri_fragment)
    @column_index = 0
    @frag = nokogiri_fragment
    puts "find table size..."
    find_table_size()
    puts "populating the grid..."
    populate_grid()
    puts "computing rowspans and colspans, save in @data..."
    compute_rowspans_and_colspans()
    puts "assemble headers from @data"
    nokogiri_headers = html_headers()
    puts "replace the boxhd with nokogiri_headers, translate remaining table entities"
    replace_nodes(nokogiri_headers)
  end
  # .... snip ....
  def replace_nodes(headers)
    # note: this actually changes values in the original nokogiri object!
    # I'll leave it to the calling script to change the name to <table>
    # @frag.xpath("//gpotable").each do |table|
    #   puts "renaming //gpotable"
    #   table.name = 'table'
    # end
    @frag.xpath("ttitle").each do |cap|
      puts "replacing ttitle with caption"
      cap.name = 'caption'
    end
    @frag.xpath("boxhd").each do |old|
      puts "replacing boxhd with generated th with computed rowspan and colspan"
      old.replace headers
    end
    @frag.xpath("row").each do |row|
      puts "renaming row to tr"
      row.name = 'tr'
    end
    @frag.xpath("tr/ent").each do |ent|
      puts "renaming ent to td"
      ent.name = 'td'
    end
    @frag.xpath("tnote").each do |tfoot|
      puts "moving tnote"
      tfoot.add_next_sibling('tnote')
    end
  end
end
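Obviously the last block, the one with the tnote, is wrong, but I'm stumped on how to append that node after the end of @frag.
I'd appreciate any nudge in the right direction; the Nokogiri tutorials and cheat sheets aren't making any sense to me.
Three hours after posting, the obvious (now that I see it) answer hit me…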
@frag.xpath("tnote").each do |tfoot|
  puts "moving tnote"
  tfoot.parent.add_next_sibling(tfoot).name = 'div'
end