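I get an error when I try to use Nokogiri to select the node whose CSS value is "WhoIsOnDutyTableLevel1:header:2". I suspect Nokogiri simply can't handle the two colons. What are my options? I can't change the structure of the HTML.
Here is the HTML: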
<html lang="en"><head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<link rel="stylesheet" type="text/css" href="stylesheets/forms.css">
<style type="text/css" media="screen" title="AlarmPoint">
@import "stylesheets/AlarmPoint.css";
</style>
</head>
<body><table id="WhoIsOnDutyTableLevel1" class="duty-report-level1">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1" class="duty-report-lt-header">Who's on duty for:
January 06, 2012 00:00 -0800</th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1">
<table id="WhoIsOnDutyTableLevel2" class="duty-report-level2">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1">Group Name</th><th id="WhoIsOnDutyTableLevel1:header:2">Group Time Zone</th><th id="WhoIsOnDutyTableLevel1:header:3">Default Devices</th><th id="WhoIsOnDutyTableLevel1:header:4">Supervisors</th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="Support" href="/alarmpoint/GroupDetails.do;jsessionid=7pj06dj6krfs?_data=TJZuNquzHUiPj8lqyud7XvypLHHjUnT5bHn7hwtTf9Ei0C2PJ8QYcKIy8OkorCWT8HDTAzkon1ls%0D%0AefuHC1N%2F0SLQLY8nxBhwesdd7Zeg6NzvCfuzRqLg5g%3D%3D" name="Support" class="details">Support</a></td><td headers="WhoIsOnDutyTableLevel1:header:2" class="centered-text">US/Pacific</td><td headers="WhoIsOnDutyTableLevel1:header:3" class="centered-text"><img border="0" src="/static/images/icon_boolean_false.png" alt="No"></td><td headers="WhoIsOnDutyTableLevel1:header:4">
<values>
</values><a id="mgr1" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z4qZ%2FFdHH4hUAixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="mgr1" class="details">Manager 1</a>
</td>
</tr>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1" colspan="4" class="no-padding">
<table id="WhoIsOnDutyTableLevel3" class="duty-report-level3">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1" class="th-left">Blah_Blah1</th><th id="WhoIsOnDutyTableLevel1:header:2" class="">08:00 - 17:00 TU WE TH FR </th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1" colspan="2" class="no-padding">
<table id="WhoIsOnDutyTableLevel4" class="duty-report-level4">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1">Recipient</th><th id="WhoIsOnDutyTableLevel1:header:2">Category</th><th id="WhoIsOnDutyTableLevel1:header:3">Escalation</th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr id="205414">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user0" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z6oL%2BaI47zI4gixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user0" class="details">User 0</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">0</td>
</tr>
<tr id="207569">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user1" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z4qZ%2FFdHH4hUAixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user1" class="details">User 1</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">10</td>
</tr>
<tr id="209107">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user2" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z6uKpyoDh%2Fz%2FQixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="uer2" class="details">User 2</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">25</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1" colspan="4" class="no-padding">
<table id="WhoIsOnDutyTableLevel3" class="duty-report-level3">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1" class="th-left">Blah_Blah</th><th id="WhoIsOnDutyTableLevel1:header:2" class="">17:00 Lasting 15:00 WE TH FR </th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr>
<td headers="WhoIsOnDutyTableLevel1:header:1" colspan="2" class="no-padding">
<table id="WhoIsOnDutyTableLevel4" class="duty-report-level4">
<caption></caption>
<thead>
<tr>
<th id="WhoIsOnDutyTableLevel1:header:1">Recipient</th><th id="WhoIsOnDutyTableLevel1:header:2">Category</th><th id="WhoIsOnDutyTableLevel1:header:3">Escalation</th>
</tr>
</thead>
<tfoot></tfoot>
<tbody>
<tr id="210855">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user0" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z72pjQodq7P5gixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user0" class="details">User 0</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">5</td>
</tr>
<tr id="210529">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user1" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z6r1%2Fmnw2SZ2AixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user1" class="details">User 1</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">10</td>
</tr>
<tr id="210337">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user2" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z6Rwqu8vCtzBAixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user2" class="details">User 2</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">15</td>
</tr>
<tr id="204675">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user3" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z5oj5jdRJbzggixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user3" class="details">User 3</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">20</td>
</tr>
<tr id="205555">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user4" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z4G8e7%2FY9SHyQixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user4" class="details">User 4</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">25</td>
</tr>
<tr id="205004">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user5" href="/alarmpoint/UserDevices.do;jsessionid=7pj06dj6krfs?_data=KpBkJeR08z5XHkcVAMfXqgixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D" name="user5" class="details">User 5</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">PERSON</td><td headers="WhoIsOnDutyTableLevel1:header:3">30</td>
</tr>
<tr id="204718">
<td headers="WhoIsOnDutyTableLevel1:header:1"><a id="user6" href="/alarmpoint/GroupDetails.do;jsessionid=7pj06dj6krfs?_data=TJZuNquzHUiUNEK29yovHscXndexl2jCbHn7hwtTf9Ei0C2PJ8QYcKIy8OkorCWT8HDTAzkon1ls%0D%0AefuHC1N%2F0SLQLY8nxBhwesdd7Zeg6NzvCfuzRqLg5g%3D%3D" name="user6" class="details">User 6</a></td><td headers="WhoIsOnDutyTableLevel1:header:2">GROUP</td><td headers="WhoIsOnDutyTableLevel1:header:3">35</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</body></html>
I ran this in my Rails console:
>> onDuty_userList = doc.at_xpath('//*[@id="WhoIsOnDutyTableLevel4"]/tbody/tr')
=> #<Nokogiri::XML::Element:0x1b83804 name="tr" attributes=[#<Nokogiri::XML::Attr:0x1b83764 name="id" value="208894">] children=[#<Nokogiri::XML::Element:0x1b83174 name="td" attributes=[#<Nokogiri::XML::Attr:0x1b83110 name="headers" value="WhoIsOnDutyTableLevel1:header:1">] children=[#<Nokogiri::XML::Element:0x1b82b70 name="a" attributes=[#<Nokogiri::XML::Attr:0x1b82aa8 name="href" value="/alarmpoint/UserDevices.do;jsessionid=3pz7t91kle3?_data=KpBkJeR08z6mdgIY4sPrzAixAYz%2BqH6ZPkanPQ24VqQFpjRFPQiWigQHttJBTMFaCLEBjP6ofpk%2B%0D%0ARqc9DbhWpI1nHAqm8ex%2BxOmu7xYUNxRSU0XUo1xoRw%3D%3D">, #<Nokogiri::XML::Attr:0x1b82a94 name="name" value="xxx">, #<Nokogiri::XML::Attr:0x1b82a80 name="id" value="xxx">, #<Nokogiri::XML::Attr:0x1b82a6c name="class" value="details">] children=[#<Nokogiri::XML::Text:0x1b7b438 "\r\n xxx, xxx (xxx)\r\n ">]>]>, #<Nokogiri::XML::Element:0x1b7b104 name="td" attributes=[#<Nokogiri::XML::Attr:0x1b7b0a0 name="headers" value="WhoIsOnDutyTableLevel1:header:2">] children=[#<Nokogiri::XML::Text:0x1b7aba0 "PERSON">]>, #<Nokogiri::XML::Element:0x1b7a984 name="td" attributes=[#<Nokogiri::XML::Attr:0x1b7a90c name="headers" value="WhoIsOnDutyTableLevel1:header:3">] children=[#<Nokogiri::XML::Text:0x1b7a420 "0">]>, #<Nokogiri::XML::Text:0x1b7a1f0 "\r\n">]>
When I try to select using the CSS value:
>> onDuty_userList.css('WhoIsOnDutyTableLevel1:header:2')
Nokogiri::CSS::SyntaxError: unexpected '2' after ':'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/css/parser_extras.rb:87:in `on_error'
from /home/dan/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
from /home/dan/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/css/parser_extras.rb:62:in `parse'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/css/parser_extras.rb:79:in `xpath_for'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/css.rb:23:in `xpath_for'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/xml/node.rb:211:in `block in css'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/xml/node.rb:210:in `map'
from /home/dan/.rvm/gems/ruby-1.9.2-p290@apraper/gems/nokogiri-1.5.0/lib/nokogiri/xml/node.rb:210:in `css'
from (irb):129
from /home/dan/.rvm/rubies/ruby-1.9.2-p290/bin/irb:16:in `<main>'
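An id selector needs a # in front of it; without one, the string is read as an element name. The colons also have to be kept away from the CSS parser, which otherwise treats each one as the start of a pseudo-class, so quote the id inside an attribute selector.
This should work:
doc.css('[id="WhoIsOnDutyTableLevel1:header:2"]')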