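I'm trying to retrieve data that a site loads via POST. I have tried many times with combinations of Nokogiri, URI, and Mechanize, but I only ever get the data from the GET request. I don't see the contents of the div I'm interested in.
Below is the body I get back from the site. I'm looking for the contents of div id="list2", which should hold a table of users and their phone numbers.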
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="Description" content="Wyszukiwarka" />
<meta name="Author" content="LR" />
<title>Tel</title>
<link href="styleblue.css" rel="stylesheet" type="text/css" />
<script type="text/javascript" src="includes/scripts.js"></script>
<script type="text/javascript" src="includes/jquery-1.6.1.min.js"></script>
<script type="text/javascript" src="includes/jquery.form.js"></script>
<link rel="stylesheet" type="text/css" href="img/themes/blue/style.css" />
<link rel="stylesheet" type="text/css" href="img/themes/ui/smoothness/jquery-ui-1.8.13.custom.css" media="screen"/>
<script type="text/javascript" src="includes/jquery-ui-1.8.13.custom.min.js"></script>
<script type="text/javascript" src="includes/ui.datepicker-pl.js"></script>
<script type="text/javascript">
$(document).ready(function(){
gridReloadTel();
})
</script></head>
<body><table style="width: 100%; margin: 0px; padding: 0px; vertical-align:top" cellpadding="0" cellspacing="0">
<tr class="hideen">
<td style="width: 100%"><table cellpadding="0" cellspacing="0" style="width:100%; margin:0px; padding:0px;">
<tr>
<td id="top_left_login" style="height: 101px"></td>
<td style="height: 101px"><img alt="" src="img/top.jpg" /></td>
<td id="top_right_login" style="height: 101px"><div style="position:relative; width:194px; left:-207px; bottom:36px; text-align:right ">Czwartek <span style="color:#FFFFFF;">03-04-2014</span></div></td>
</tr>
</table></td>
</tr>
<tr class="hideen">
<td id="menu"><div >
<img src="img/blue/mline.jpg" border="0" alt="" /><a href="index.php">Wyszukiwarka</a><img src="img/blue/mline.jpg" border="0" alt="" /><a href="aktualizacja.php">Aktualizacja danych</a><img src="img/blue/mline.jpg" border="0" alt="" /><a href="pomoc.php">Pomoc</a><img src="img/blue/mline.jpg" border="0" alt="" />
</div>//Content
</div>
<br /><br />
<div id="list2">I LOOKING FOR THIS DIV</div>
<br />
</div>
<blockquote style="font-size:10px ">
* aktualizacje <br/>
<img src="img/plus.gif" width="18" height="18" />
</blockquote></td>
</tr>
<tr class="hideen">
<td style="width: 100%"><div id="bottom" align="center"><img src="img/bzit.jpg" width="225" height="42" border="0" alt="" /></div></td>
</tr>
</table>
</body>
</html>
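When I inspect the site in Firebug, I see GET url/index.php and POST url/grids/search.php. The site is hosted on a local network. When I go to the XHR tab, where the POST to search.php is listed, I see these response headers: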
Connection Keep-Alive
Content-Type text/html
Date Thu, 03 Apr 2014 05:31:44 GMT
Keep-Alive timeout=15, max=100
Server Apache
Transfer-Encoding chunked
X-Powered-By PHP/5.2.5
and these request headers:
Accept */*
Accept-Encoding gzip, deflate
Accept-Language pl,en-US;q=0.7,en;q=0.3
Cache-Control no-cache
Connection keep-alive
Content-Length 99
Content-Type application/x-www-form-urlencoded; charset=UTF-8
Host url
Pragma no-cache
Referer url/index.php
User-Agent Mozilla/5.0 (Windows NT 5.1; rv:28.0) Gecko/20100101 Firefox/28.0
X-Requested-With XMLHttpRequest
Next is the Response tab, which is the part I'm interested in:
`<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="Description" content="Wyszukiwarka telefonów" />
<meta name="Author" content="LR" />
<title>tel</title>
<link rel="stylesheet" type="text/css" href="/img/themes/blue/style.css" />
</head>
<body>
<div id="contenttable">
<table class="scroll" cellpadding="0" cellspacing="0" width="100%" >
<thead >
<tr>
<td colspan="11">Lista wyników *</td>
</tr>
</thead>
<tbody >
ROWS WITH TELEPHONES
</tbody>
</table>
<table class="scroll" cellpadding="0" cellspacing="0" width="100%" >
<tbody >
</tbody>
<tfoot align="center">
<tr>
<td colspan="11" style="text-align:left"><img src="img/themes/blue/images/first.png" onclick="jQuery('#page').val(1);gridReloadTel()" /> <img src="img/themes/blue/images/prev.png" onclick="jQuery('#page').val(1);gridReloadTel()" />
<input id="page" type="text" value="2" size="3" maxlength="5" onkeydown="doSearchTel(arguments[0]||event)" />
/ 802 <img src="img/themes/blue/images/next.png" onclick="jQuery('#page').val(3);gridReloadTel()" /> <img src="img/themes/blue/images/last.png" onclick="jQuery('#page').val(802);gridReloadTel()" /> | wyświetl
<select id="rows" name="rows" onchange="gridReloadTel()">
<option value="15" selected >15</option>
<option value="25" >25</option>
<option value="50" >50</option>
<option value="200" >200</option>
</select>
| 12016 wierszy</td>
</tr>
</tfoot>
</table>
</div>
<div style="position:absolute; top:140px; right:20px;" class="hideen"><form action="export.php" method="post" target="_blank" id="exportform" name="exportform" >
<a href="javascript:document.exportform.submit();" onmouseout="MM_swapImgRestore()" onmouseover="MM_swapImage('xlsex','','img/xls_down.jpg',1)"><img src="img/xls_up.jpg" name="xlsex" border="0" id="xlsex" title="Wygeneruj spis wyb" /></a>
<input name="sord" type="hidden" value="PRNazwa asc" /><input name="where" type="hidden" value=" 1=1 " />
<input type="hidden" name="start" value="15" />
<input type="hidden" name="limit" value="15" />
</form></div>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', '']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</body></html>`
How can I retrieve this data from div id='contenttable'? Any answer or idea would be very helpful.
Try Mechanize:
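require 'mechanize'
require 'logger'

# Load index.php first so the agent picks up any cookies the site sets.
@agent = Mechanize.new do |a|
  a.user_agent_alias = 'Windows Chrome'
  a.log = Logger.new "activity.log"
  a.get 'url/index.php'
end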
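Now you can use
# second argument: form fields; third argument: extra request headers
@agent.post('url/grids/search.php', { "foo" => "bar" }, { "X-Requested-With" => "XMLHttpRequest" })
Here "foo" => "bar" is only a placeholder. To find the real query parameters and headers, look at the POST request in your browser's developer tools (the Post and Headers tabs in Firebug).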
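If it helps to see what that POST amounts to without Mechanize, here is a rough sketch using only Ruby's standard library. It rebuilds the XHR request that Firebug shows. The helper names `build_search_request` and `fetch_search_html` are made up for this example, and the form fields `page` and `rows` are only a guess based on the pager inputs in the response; the real field names have to come from the Post tab in Firebug.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a POST that mimics the XHR seen in Firebug.
# base_url is a placeholder for the real site root, and params holds
# whatever form fields the Post tab shows (assumed here, not confirmed).
def build_search_request(base_url, params)
  uri = URI.join(base_url, 'grids/search.php')
  request = Net::HTTP::Post.new(uri)
  # Encodes params as an application/x-www-form-urlencoded body.
  request.set_form_data(params)
  # Headers copied from the request headers listed in the question.
  request['X-Requested-With'] = 'XMLHttpRequest'
  request['Referer'] = URI.join(base_url, 'index.php').to_s
  request
end

# Send the request and return the raw HTML of the response body.
def fetch_search_html(base_url, params)
  request = build_search_request(base_url, params)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request).body
  end
end
```

Something like `fetch_search_html('http://url/', 'page' => 2, 'rows' => 15)` would then return the HTML shown in the Response tab, assuming the site does not also require a session cookie from index.php (Mechanize takes care of that for you). The rows can be pulled out of that HTML with Nokogiri, for example `Nokogiri::HTML(html).css('#contenttable tbody tr')`. With Mechanize, the page returned by `@agent.post` can be queried the same way with `page.search('#contenttable tbody tr')`.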