This is the XPath from Chrome's "Inspect Element":
//*[@id="configparse_port_list"]
This is the Nokogiri CSS selector I'm using to access the table:
doc.css("#configparse_port_list")
But all I get back is an empty array.
What am I doing wrong?
This doesn't work either:
doc.css('table[@id="configparse_port_list"]')
The HTML:
<!DOCTYPE html>
<html>
<head>
<title>SIAM</title>
<link href="/assets/application-49cce08127ac99204d4cb59e3bfaab8e.css" media="all" rel="stylesheet" type="text/css" />
<script src="/assets/application-50259c7e8f6a002b7166ab714e68857b.js" type="text/javascript"></script>
<script src="/assets/controllers/configparse_ports-925b92a6e41f7ffc3014e351d29291fc.js" type="text/javascript"></script>
<meta content="authenticity_token" name="csrf-param" />
<meta content="FFh3mbfqnLZhWclBmQ/kEeYSJPeQvapaC0tK9f4wWH8=" name="csrf-token" />
</head>
<body class="configparse_ports_index ctrl_configparse_ports" data-controller="configparse_ports" data-action="index">
<div id="header">
<a href="https://siam-pro.qa.domain.com/"><img alt="domain_logo" src="/assets/domain_logo-0e44a80f1d9f1f9ce8fb7aa35dbc008b.png" /></a>
<div>
<div class="product_name">SIAM</div>
<div class="version">v5.1</div>
</div>
<form accept-charset="UTF-8" action="/search/quick.json" class="ignoreDirty" data-remote="true" id="quick_search" method="post"><div style="margin:0;padding:0;display:inline "><input name="utf8" type="hidden" value="✓" /><input name="authenticity_token" type="hidden" value="FFh3mbfqnLZhWclBmQ/kEeYSJPeQvapaC0tK9f4wWH8=" /></div>
<input id="search_testcases" name="search[testcases]" type="hidden" value="true" />
<input id="search_testplans" name="search[testplans]" type="hidden" value="true" />
<input id="search_component_names" name="search[component_names]" type="hidden" value="true" />
<input autocomplete="off" id="search_term" name="search[term]" placeholder="Search" type="text" />
</form>
<ul class="menu">
<li><a href="https://siam-pro.qa.domain.com/">Home</a></li>
<li><a href="/settings">Settings</a></li>
</ul>
</div>
<div id="wrapper">
<div id="content">
<div id="loading">Loading ...</div>
<div id="flash">
</div>
<div id="warning_message"></div>
<h1>Listing Configparse Ports</h1>
<div id="configparse_port_filters" class="filter_wrap">
<h4>Filter </h4>
</div>
<table id="configparse_port_list">
<thead>
<tr>
<th>ID #</th>
<th>Name</th>
<th>ANI Release</th>
<th>Network Configuration</th>
<th>State</th>
</tr>
</thead>
<tbody>
<tr>
#MANY TRS - one of which I'm looking for based on the 3rd td (ANI Release)
</tr>
</tbody>
</table>
</div>
</div>
<div id="sidebar">
<h3>Testcases</h3>
<ul>
<li><a href="/testcases/new">New</a></li>
<li><a href="/search/testcase/new">Search</a></li>
<li><a href="/search/bugzilla_cr/new">Import RTC</a></li>
</ul>
<h3>Testplans</h3>
<ul>
<li><a href="/testplans/new">New</a></li>
<li><a href="/search/testplan/new">Search</a></li>
<li><a href="/testplans">List Active</a></li>
</ul>
<h3>Use Cases</h3>
<ul>
<li><a href="/use_cases/new">New</a></li>
<li><a href="/search/use_case/new">Search</a></li>
<li><a href="/use_cases/manage">Manage</a></li>
</ul>
<h3>Configparse</h3>
<ul>
<li><a href="/configparse_ports/new">New</a></li>
<li><a href="/configparse_ports">List Ports</a></li>
</ul>
<h3>Automation</h3>
<ul>
<li><a href="/automation_suites/new">New</a></li>
<li><a href="/search/automation_suite/new">Search</a></li>
<li><a href="/automation/status">Status</a></li>
</ul>
</div>
<div id="footer">
<div>
<ul class="menu">
<li><a href="mailto:siam-help@domain.com">Email SIAM Support</a></li>
<li><a href="http://agora.domain.com/wiki/SIAM">SIAM WIKI</a></li>
</ul>
<div class="copyright">© 2012 domain Technologies</div>
</div>
</div>
<script id="quick_search_results_template" type="text/html">
<div>
{{#resources}}
<div class="search_result search_result_{{internal_name}}">
<h4>{{name}}</h4>
{{#count}}
<table>
<thead>
<tr>
<th>ID</th>
<th></th>
</tr>
</thead>
<tbody>
{{#results}}
<tr class="search_result_{{id}}">
<td><a href="{{url}}">{{id}}</a></td>
<td class="search_result_name"><a title="{{name}}" href="{{url}}">{{name}}</a></td>
</tr>
{{/results}}
</tbody>
</table>
<a class='more_results' href="{{search_url}}">More results</a>
{{/count}}
{{^results}}
<div class='no_results'>No matches found</div>
{{/results}}
</div>
{{/resources}}
</div>
</script>
<script type="text/html" id="warning_message_template">
<div class="ui-widget" id="warning_message">
<div class="ui-state-highlight ui-corner-all">
<span class="ui-icon ui-icon-info"></span>
<p>{{message}}</p>
</div>
</div>
</script>
<!-- notification template -->
<div id="notifcation-container" style="display:none">
<div id="basic-template">
<a class="ui-notify-cross ui-notify-close" href="#">x</a>
<h1>#{title}</h1>
<p>#{text}</p>
</div>
</div>
</body>
</html>
Using the following code, I can't reproduce the problem of not finding the element with id="configparse_port_list":
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<!DOCTYPE html>
<html>
<head>
<title>SIAM</title>
<link href="/assets/application-49cce08127ac99204d4cb59e3bfaab8e.css" media="all" rel="stylesheet" type="text/css" />
<script src="/assets/application-50259c7e8f6a002b7166ab714e68857b.js" type="text/javascript"></script>
<script src="/assets/controllers/configparse_ports-925b92a6e41f7ffc3014e351d29291fc.js" type="text/javascript"></script>
<meta content="authenticity_token" name="csrf-param" />
<meta content="FFh3mbfqnLZhWclBmQ/kEeYSJPeQvapaC0tK9f4wWH8=" name="csrf-token" />
</head>
<body class="configparse_ports_index ctrl_configparse_ports" data-controller="configparse_ports" data-action="index">
<div id="header">
<a href="https://siam-pro.qa.domain.com/"><img alt="domain_logo" src="/assets/domain_logo-0e44a80f1d9f1f9ce8fb7aa35dbc008b.png" /></a>
<div>
<div class="product_name">SIAM</div>
<div class="version">v5.1</div>
</div>
<form accept-charset="UTF-8" action="/search/quick.json" class="ignoreDirty" data-remote="true" id="quick_search" method="post"><div style="margin:0;padding:0;display:inline "><input name="utf8" type="hidden" value="✓" /><input name="authenticity_token" type="hidden" value="FFh3mbfqnLZhWclBmQ/kEeYSJPeQvapaC0tK9f4wWH8=" /></div>
<input id="search_testcases" name="search[testcases]" type="hidden" value="true" />
<input id="search_testplans" name="search[testplans]" type="hidden" value="true" />
<input id="search_component_names" name="search[component_names]" type="hidden" value="true" />
<input autocomplete="off" id="search_term" name="search[term]" placeholder="Search" type="text" />
</form>
<ul class="menu">
<li><a href="https://siam-pro.qa.domain.com/">Home</a></li>
<li><a href="/settings">Settings</a></li>
</ul>
</div>
<div id="wrapper">
<div id="content">
<div id="loading">Loading ...</div>
<div id="flash">
</div>
<div id="warning_message"></div>
<h1>Listing Configparse Ports</h1>
<div id="configparse_port_filters" class="filter_wrap">
<h4>Filter </h4>
</div>
<table id="configparse_port_list">
<thead>
<tr>
<th>ID #</th>
<th>Name</th>
<th>ANI Release</th>
<th>Network Configuration</th>
<th>State</th>
</tr>
</thead>
<tbody>
<tr>
#MANY TRS - one of which I'm looking for based on the 3rd td (ANI Release)
</tr>
</tbody>
</table>
</div>
</div>
<div id="sidebar">
<h3>Testcases</h3>
<ul>
<li><a href="/testcases/new">New</a></li>
<li><a href="/search/testcase/new">Search</a></li>
<li><a href="/search/bugzilla_cr/new">Import RTC</a></li>
</ul>
<h3>Testplans</h3>
<ul>
<li><a href="/testplans/new">New</a></li>
<li><a href="/search/testplan/new">Search</a></li>
<li><a href="/testplans">List Active</a></li>
</ul>
<h3>Use Cases</h3>
<ul>
<li><a href="/use_cases/new">New</a></li>
<li><a href="/search/use_case/new">Search</a></li>
<li><a href="/use_cases/manage">Manage</a></li>
</ul>
<h3>Configparse</h3>
<ul>
<li><a href="/configparse_ports/new">New</a></li>
<li><a href="/configparse_ports">List Ports</a></li>
</ul>
<h3>Automation</h3>
<ul>
<li><a href="/automation_suites/new">New</a></li>
<li><a href="/search/automation_suite/new">Search</a></li>
<li><a href="/automation/status">Status</a></li>
</ul>
</div>
<div id="footer">
<div>
<ul class="menu">
<li><a href="mailto:siam-help@domain.com">Email SIAM Support</a></li>
<li><a href="http://agora.domain.com/wiki/SIAM">SIAM WIKI</a></li>
</ul>
<div class="copyright">© 2012 domain Technologies</div>
</div>
</div>
<script id="quick_search_results_template" type="text/html">
<div>
{{#resources}}
<div class="search_result search_result_{{internal_name}}">
<h4>{{name}}</h4>
{{#count}}
<table>
<thead>
<tr>
<th>ID</th>
<th></th>
</tr>
</thead>
<tbody>
{{#results}}
<tr class="search_result_{{id}}">
<td><a href="{{url}}">{{id}}</a></td>
<td class="search_result_name"><a title="{{name}}" href="{{url}}">{{name}}</a></td>
</tr>
{{/results}}
</tbody>
</table>
<a class='more_results' href="{{search_url}}">More results</a>
{{/count}}
{{^results}}
<div class='no_results'>No matches found</div>
{{/results}}
</div>
{{/resources}}
</div>
</script>
<script type="text/html" id="warning_message_template">
<div class="ui-widget" id="warning_message">
<div class="ui-state-highlight ui-corner-all">
<span class="ui-icon ui-icon-info"></span>
<p>{{message}}</p>
</div>
</div>
</script>
<!-- notification template -->
<div id="notifcation-container" style="display:none">
<div id="basic-template">
<a class="ui-notify-cross ui-notify-close" href="#">x</a>
<h1>title</h1>
<p>text</p>
</div>
</div>
</body>
</html>
EOT
After running that, the HTML is parsed and ready to be searched:
configparse_port_list = doc.at('#configparse_port_list')
configparse_port_list.to_html
# => "<table id=\"configparse_port_list\">\n<thead><tr>\n<th>ID #</th>\n <th>Name</th>\n <th>ANI Release</th>\n <th>Network Configuration</th>\n <th>State</th>\n </tr></thead>\n<tbody><tr>\n #MANY TRS - one of which I'm looking for based on the 3rd td (ANI Release)\n </tr></tbody>\n</table>"
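One thing I'd be careful about: `doc.css("#configparse_port_list")` is something of a contradiction. `css` is meant to return all nodes that match the given criteria, whereas `#configparse_port_list` can exist only once in a valid document because it is an ID. Nokogiri will happily return a NodeSet containing that single element for `css`, but this can confuse other people who aren't reading the code closely. I'd recommend writing it as `at("#configparse_port_list")`, because `at` returns a single element, which keeps the code in sync with the fact that only one element can match the ID.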
configparse_port_list = doc.css('#configparse_port_list').class # => Nokogiri::XML::NodeSet
configparse_port_list = doc.css('#configparse_port_list').size # => 1
These also work; just keep in mind the earlier caveat about `css` and single elements:
doc.css('table[@id="configparse_port_list"]').size # => 1
doc.css('table#configparse_port_list').size # => 1
You might want to check that your Nokogiri and libXML2 environment is up to date:
nokogiri -v
The current Nokogiri, as of this writing, is 1.6.0.
Note that Nokogiri isn't happy with the document:
doc.errors
# => [#<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>,
# #<Nokogiri::XML::SyntaxError: Element script embeds close tag>]
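It turned out I was stuck behind a pubcookie authentication server. Once I was authenticated, I could access the HTML table the way I had originally tried (although using `.at` is preferable when fetching a node by id).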