Running a Javascript function to populate a table, then parsing the HTML page with Jsoup



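I'm working on a project for my job that lets users parse a given HTML page containing information about potential customers (leads). The problem I'm facing is that the page displays the lead information in a table which, as far as I understand, is populated by a Javascript function, so when Jsoup parses the document it cannot find the table or any of its contents. This is the HTML I'm specifically interested in: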

<table class="none" align="center" bgcolor="white" border="0" cellpadding="1" cellspacing="0" width="100%">
<tbody><tr class="tm_tt_ftr1">
<td class="typedata1">&nbsp;</td>
<td class="typedata1" colspan="3">Name</td>
<td class="typedata1">Phone</td>
<td class="typedata1x" colspan="2">$$$ Summary&nbsp;</td>
</tr>
<tr class="tm_tt_body">
<td class="typedata1" title="Lookup this name historical"><center>
<a href="#" onclick="javascript:Pop_Up('X','Testerson',
'Testerson','Tes','Test');">
N</a></center></td>
<td class="typedata1" colspan="3">&nbsp;Testerson, Test           </td>
<td class="typedata1">
<b><a href="rtrpt.cgi?DATE_OPT=US_TERSE
&amp;RT_SCRIPT=mkcnt/cnt_lookup_phone_cgi.rt&amp;JDATE=TODAY
&amp;DATE1=TODAY&amp;DATE2=TODAY&amp;QSRC=ALL&amp;DETAIL=N
&amp;QPAC=631&amp;QPRE=384&amp;QPNUM=6191" title="Search phone history this number" target="_new">P1:</a></b>
<a href="rtrpt.cgi?
DATE_OPT=X&amp;RT_SCRIPT=mkcnt/lead_phn_cgi.rt
&amp;LEAD=011876280" title="Additional phone numbers this lead" target="_new">
<b>222-222-2222</b></a>
</td>
<td width="10%">Charge&nbsp;</td>
<td width="10%">    49.00</td>
</tr>
<tr class="tm_tt_body">
<td class="typedata1" title="Lookup this name historical" colspan="1"><center>
&nbsp;</center></td>
<td class="typedata1" colspan="3">&nbsp;</td>
<td class="typedata1">&nbsp;
&nbsp;
<b>               </b>
</td>
<td class="fd_tt_body_neg">Paid&nbsp;</td>
<td class="fd_tt_body_neg" colspan="1">    49.00</td> <!--This is what I am looking to extract -->
</tr>
<tr class="tm_tt_body">
<td>&nbsp;</td>
<td class="typedata1" colspan="3">9 Daniel Ln&nbsp;</td>
<td class="typedata1" colspan="1">Email
<a id="ld_email" href="mailto:testtesterson@gmail.com?subject='L11876280'">
testtesterson@gmail.com</a>
</td>
<td>Due&nbsp;</td>
<td>     0.00</td>
</tr>
<tr class="tm_tt_body">
<td>&nbsp;</td>
<td class="typedata1" colspan="3">&nbsp;</td>
<td class="typedata1" colspan="1">CB  @ -------</td>
<td class="typedata1" colspan="1">&nbsp;</td>
<td class="typedata1" colspan="1">1B&nbsp;</td>
</tr>
<tr class="tm_tt_body">
<td class="typedata1"><center> 111</center></td>
<td class="typedata1" colspan="3">Springfield NY 11953</td>
<td class="typedata1" colspan="1">Comm:&nbsp;1314379</td>
<td colspan="2"><center>DC: ., .</center></td>
</tr>
<tr class="tm_tt_body">
<td class="typedata1" colspan="5">&nbsp;</td>
<td colspan="2">&nbsp;
</td>
</tr>
</tbody></table>

As mentioned above, Jsoup cannot find this table or any of its contents at all. The div containing the table has a Javascript function like this:

<script language="Javascript">
function UpdateDiv(){
$.ajax({
url: "http://flag.60north.net/cgi-bin/rtrpt_tabpanel2G_New.cgi", 
type: 'POST', 
async: true, 
dataType: 'html', 
data: "RT_SCRIPT=telemkt/prime/leadcgiUpDate_New.rt&DATE_OPT=X&DETAIL=N&LNUM=" + $("input#LNUM").val(), 
timeout: 90000, 
success: 
function(retData){ 
$(".Lead_Info").html(retData);
}
});
}
</script>

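From what I understand, this function is called to populate the table. What I'd like is a way to run that function so the page is filled with the lead's information, and then parse it with Jsoup. From my own research I found that the Selenium API allows Javascript functions in an HTML document to be executed, but I don't think that solves my problem. As far as I know, whatever Selenium runs has no effect on the HTML that Jsoup parses, because Jsoup connects to the URL and retrieves the document itself. Obviously, if Jsoup were able to do so, I'd have it run the function and then parse, but that isn't an available feature. What is the next best solution for getting this lead information to show up?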

You can try this approach:

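// Load the page in a real browser so that its scripts are available.
WebDriver driver = new ChromeDriver();
driver.get(url);
// Call the page's own UpdateDiv() function to populate the table.
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("UpdateDiv();");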

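Then extract the HTML from the WebDriver and pass it to Jsoup for parsing and any further processing. Note that UpdateDiv() sends its request asynchronously (async: true), so the table may not be in the page yet at the moment executeScript returns; wait until it has appeared before reading the page source: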

String html = driver.getPageSource();
Document doc = Jsoup.parse(html);
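
If starting a browser is not desirable, another option is to skip UpdateDiv() and send the same POST request it sends. The following is only a rough sketch using the JDK's built-in HTTP client (Java 11+), not something tested against the real site. The URL and fixed parameters are copied from the script in the question; the class name LeadFragmentFetcher, its methods and the leadNumber argument (standing in for the value of input#LNUM) are hypothetical. It also assumes the endpoint answers without a login session or cookies, which may well not be the case.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper: reproduces the request made by the page's UpdateDiv() function.
public class LeadFragmentFetcher {
    // Endpoint and fixed parameters copied from the script in the question.
    static final String ENDPOINT =
            "http://flag.60north.net/cgi-bin/rtrpt_tabpanel2G_New.cgi";
    static final String FIXED_PARAMS =
            "RT_SCRIPT=telemkt/prime/leadcgiUpDate_New.rt&DATE_OPT=X&DETAIL=N&LNUM=";

    // Builds the form body; leadNumber stands in for $("input#LNUM").val().
    static String buildFormBody(String leadNumber) {
        return FIXED_PARAMS + URLEncoder.encode(leadNumber, StandardCharsets.UTF_8);
    }

    // POSTs the form body and returns the HTML fragment from the response.
    // Assumption: no session cookie or authentication is required.
    static String fetchLeadHtml(String leadNumber)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(90)) // same 90 s timeout as the script
                .header("Content-Type",
                        "application/x-www-form-urlencoded; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(buildFormBody(leadNumber)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The string returned by fetchLeadHtml can be handed to Jsoup.parse in the same way as the page source above. A selector such as td.fd_tt_body_neg should then match the "Paid" cells from the sample table, with the amount in the second matching cell.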
