CasperJS does not follow links on an ASP website



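I'm trying to use CasperJS to mimic browser behavior on a website built with ASP, which seems to rely heavily on JavaScript-based links and UI. I'm stuck and don't know what to try next.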

I'm using casperjs 1.1.0-beta3 and phantomjs 1.9.8, and the site URL is https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx

This is the HTML of the link I want to click:

<td>
    <a href="javascript:__doPostBack('ctl00$m$g_ba040fcb_44f7_44fa_92d0_d088c5679794$ctl00$gvCasedetails','Page$3')">3</a>
</td>

The site has some SSL configuration problems, so CasperJS only works when run with a few extra flags: casperjs --ignore-ssl-errors=true --ssl-protocol=tlsv1 icsid.js

icsid.js simply tries to open the site and click a link to get to the next page of results. I want to go through all of the results.

var casper = require('casper').create({
    clientScripts: ["./jquery.min.js"],
    verbose: true,
    logLevel: 'debug',
    pageSettings: {
        loadImages: false,
        loadPlugins: false,
        userAgent: 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.2 Safari/537.36',
    }
});
casper.start('https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx', function(){});
casper.then(function() {
    this.wait(5000);
    this.capture('screenshot0.png');
    casper.then(function(){
        var text = this.evaluate(function(){
            return jQuery('.gdcol a')[0].text;
        });
        console.log('text: ' + text);
        this.evaluate(function(){
            // try to go to second page
            return jQuery('a').filter(function(index) { return $(this).text() === "2"; })[0].click();
        });
    });

    casper.then(function(){
        this.wait(5000);
        var size = this.evaluate(function(){
            return jQuery('.gdcol a').size();
        });
        console.log('size: ' + size);
        // if successfully clicked and changed url, the link text will change
        var text = this.evaluate(function(){
            return jQuery('.gdcol a')[0].text;
        });
        console.log('text: ' + text);
        // if it's still on the first page, this will be null
        var page = this.evaluate(function(){
            return jQuery('a').filter(function(index) { return $(this).text() === "1"; })[0].text;
        });
        console.log('page: ' + page);
        // if it's on the second page, this will be null
        var page = this.evaluate(function(){
            return jQuery('a').filter(function(index) { return $(this).text() === "2"; })[0].text;
        });
        console.log('page: ' + page);
        this.capture('screenshot1.png');
    });
});
casper.run();

This is the resulting log:

[info] [phantom] Starting...
[info] [phantom] Running suite: 3 steps
[debug] [phantom] opening url: https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx, HTTP GET
[debug] [phantom] Navigation requested: url=https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx"
2015-07-23 11:48:31.255 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.256 phantomjs[10699:d13] CoreText performance note: Set a breakpoint on CTFontLogSuboptimalRequest to debug.
2015-07-23 11:48:31.278 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.279 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.280 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.280 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.479 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
2015-07-23 11:48:31.480 phantomjs[10699:d13] CoreText performance note: Client called CTFontCreateWithName() using name "Arial" and got font with PostScript name "ArialMT". For best performance, only use PostScript names when calling this API.
[debug] [phantom] Automatically injected ./jquery.min.js client side
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step anonymous 2/3 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
[info] [phantom] Step anonymous 2/3: done in 1886ms.
[info] [phantom] Step anonymous 3/3 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
[debug] [phantom] Capturing page to /Users/yubrew/app/lib/tasks/screenshot0.png
[info] [phantom] Capture saved to /Users/yubrew/app/lib/tasks/screenshot0.png
[info] [phantom] Step anonymous 3/3: done in 2347ms.
[info] [phantom] Step _step 4/6 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
[info] [phantom] Step _step 4/6: done in 2351ms.
[info] [phantom] wait() finished waiting for 5000ms.
[info] [phantom] Step anonymous 5/6 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
text: ARB/15/30
[info] [phantom] Step anonymous 5/6: done in 7377ms.
[info] [phantom] Step anonymous 6/6 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
size: 50
text: ARB/15/30
page: null
page: 2
[debug] [phantom] Capturing page to /Users/yubrew/app/lib/tasks/screenshot1.png
[info] [phantom] Capture saved to /Users/yubrew/app/lib/tasks/screenshot1.png
[info] [phantom] Step anonymous 6/6: done in 7491ms.
[info] [phantom] Step _step 7/7 https://icsid.worldbank.org/apps/ICSIDWEB/cases/Pages/AdvancedSearch.aspx (HTTP 200)
[info] [phantom] Step _step 7/7: done in 7493ms.
[info] [phantom] wait() finished waiting for 5000ms.
[info] [phantom] Done 7 steps in 12493ms

Note that both the screenshots and the console log show that the page content did not change.

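You shouldn't use Element.click(), because most of the time it does nothing in PhantomJS. Use CasperJS's click() function instead, which tries several approaches to click the element properly.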

Since CSS selectors don't support matching on an element's text, you can use XPath for this:

var x = require('casper').selectXPath;
...
casper.click(x("//a[text()='2']"));
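
As a rough sketch of how this might be applied to the pager links in the question, the helper below builds the XPath for a given page number and passes it to CasperJS's click(). The names pageLinkXPath and clickPage are made up for illustration, and the Casper instance and the selectXPath helper are passed in as arguments so that the snippet relies only on calls the answer already uses, plus casper.exists().

```javascript
// Hypothetical helper: XPath for a pager link whose visible text is the page number.
// normalize-space() tolerates stray whitespace around the number.
function pageLinkXPath(pageNumber) {
    return "//a[normalize-space(text())='" + Number(pageNumber) + "']";
}

// Hypothetical helper: click the pager link with casper.click()
// instead of calling Element.click() inside evaluate().
// `casper` is a Casper instance; `x` is require('casper').selectXPath.
function clickPage(casper, x, pageNumber) {
    var selector = x(pageLinkXPath(pageNumber));
    if (!casper.exists(selector)) {
        return false; // no such pager link on the current page
    }
    casper.click(selector);
    return true;
}

// Intended use inside a step (assumes `casper` and `x` are already defined):
//   casper.then(function () { clickPage(this, x, 2); });
```

Note that this matches any link whose text is exactly the page number, so on a page with other numeric links a narrower XPath may be needed.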

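Your other misconception is that casper.wait() takes effect at the point where you call it. All then*() and wait*() functions are asynchronous step functions: calling one only schedules a step to be executed after the current step finishes.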

For example:

this.wait(5000);
this.capture('screenshot0.png');
this.then(function(){...

is executed like this:

this.capture('screenshot0.png');
this.wait(5000);
this.then(function(){...
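
The reordering can be mimicked with a toy step queue. This is not CasperJS's actual implementation, only a simplified model that assumes then() and wait() merely add a step to a queue while a plain call such as capture() takes effect immediately. TinyStepper and its methods are hypothetical names, and the real delay is left out.

```javascript
// Toy model of a step queue (NOT CasperJS's real code; all names are hypothetical).
function TinyStepper() {
    this.queue = []; // steps scheduled but not yet run
    this.trace = []; // order in which things actually happened
}

// then() only schedules a step; nothing runs at call time.
TinyStepper.prototype.then = function (label, fn) {
    this.queue.push({ label: label, fn: fn });
    return this;
};

// wait() is also just a scheduled step (the actual delay is omitted here).
TinyStepper.prototype.wait = function (ms) {
    return this.then('wait ' + ms);
};

// capture() is synchronous: it takes effect immediately.
TinyStepper.prototype.capture = function (name) {
    this.trace.push('capture ' + name);
    return this;
};

// run() drains the queue in order; steps scheduled by a running step
// are placed right after it, ahead of steps that were queued earlier.
TinyStepper.prototype.run = function () {
    while (this.queue.length > 0) {
        var step = this.queue.shift();
        this.trace.push(step.label);
        if (step.fn) {
            var pending = this.queue;
            this.queue = [];
            step.fn.call(this);
            this.queue = this.queue.concat(pending);
        }
    }
    return this.trace;
};

var stepper = new TinyStepper();
stepper.then('outer step', function () {
    this.wait(5000);
    this.capture('screenshot0.png');
    this.then('inner step');
});
console.log(stepper.run().join(' -> '));
// prints "outer step -> capture screenshot0.png -> wait 5000 -> inner step"
```

Even though wait() is written first, the capture shows up in the trace before it, because only the capture runs at call time.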

Where you can, wrap synchronous function calls such as capture() in casper.then() so that they run in the intended order.
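
One possible way to get the screenshot taken after the delay is to pass the synchronous call as the callback of wait(), which CasperJS runs once the delay has elapsed. The sketch below assumes a Casper instance is passed in; captureAfterWait is a hypothetical name, not part of CasperJS.

```javascript
// Hypothetical helper: schedule "wait, then capture" so the screenshot
// is taken after the delay rather than before it.
// `casper` is a Casper instance; delay and file name are the caller's choice.
function captureAfterWait(casper, delayMs, fileName) {
    // wait() accepts an optional callback that runs when the delay is over;
    // inside it, `this` refers to the Casper instance.
    casper.wait(delayMs, function () {
        this.capture(fileName);
    });
    return casper;
}

// Intended use inside a step (assumes `casper` is already defined):
//   casper.then(function () { captureAfterWait(this, 5000, 'screenshot0.png'); });
```

A fixed delay is still a guess about how long the postback takes; waiting for a specific change on the page, for example with one of the waitFor*() functions, may be more robust.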
