Navigating a website with HtmlUnit after logging in with a WebRequest



I had trouble logging in through the form with HtmlUnit's click function, so I decided to log in with a WebRequest instead.

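The site's login works like this: the form's submit button triggers an Ajax call to a separate URL. Once the response to that POST request arrives, the page reloads automatically and you are logged in.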

    // Client configuration
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    webClient.getOptions().setJavaScriptEnabled(true);
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    webClient.getOptions().setCssEnabled(false);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    // Get cookies (not sure if this is necessary to log in)
    HtmlPage webPage = (HtmlPage)webClient.getPage("https://www.marinetraffic.com/");
    URL cookieURL = new URL("https://www.marinetraffic.com/");
    String cookies = webClient.getCookies(cookieURL).toString();
    // Configure request headers
    URL url = new URL("https://www.marinetraffic.com/en/users/ajax_login");
    WebRequest requestSettings = new WebRequest(url, HttpMethod.POST);
    requestSettings.setAdditionalHeader(":authority", "www.marinetraffic.com");
    requestSettings.setAdditionalHeader(":method", "POST");
    requestSettings.setAdditionalHeader(":path", "/en/users/ajax_login");
    requestSettings.setAdditionalHeader(":scheme", "https");
    requestSettings.setAdditionalHeader("accept", "*/*");
    requestSettings.setAdditionalHeader("accept-encoding", "gzip,deflate,sdch");
    requestSettings.setAdditionalHeader("accept-language", "en-US,en;q=0.8");
    requestSettings.setAdditionalHeader("content-type", "application/x-www-form-urlencoded; charset=UTF-8");
    requestSettings.setAdditionalHeader("cookie", cookies);
    requestSettings.setAdditionalHeader("origin", "https://www.marinetraffic.com");
    requestSettings.setAdditionalHeader("referer", "https://www.marinetraffic.com/en/ais/home/centerx:-33.1/centery:21.4/zoom:4");
    requestSettings.setAdditionalHeader("user-agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36");
    requestSettings.setAdditionalHeader("x-requested-with", "XMLHttpRequest");
    // Request body with form information
    requestSettings.setRequestBody("_method=POST&email=dummy%40gmail.com&password=fakepassword&is_ajax=true");

    // redirectPage is of type UnexpectedPage
    Page redirectPage = webClient.getPage(requestSettings);
    webClient.waitForBackgroundJavaScript(10 * 1000);
    // Console confirms login was a success
    System.out.println(redirectPage.getWebResponse().getContentAsString());
    System.out.println(webClient.getCookies(cookieURL).toString());
    // When I try to navigate to the main page I am not logged in
    HtmlPage webPage2 = (HtmlPage)webClient.getPage("https://www.marinetraffic.com/");
    System.out.println(webPage2.asXml());

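I also tried a WebRequest call to the main site using the updated cookies, but that returned an UnexpectedPage as well. Now that I am logged in correctly, how do I get an HtmlPage so I can navigate the site?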

Just try to interact with the page the way an ordinary user would:

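  • Get the page
  • Locate the sign-in element
  • Click the sign-in element
  • Locate the input fields
  • Type your user ID and password into the fields
  • Locate and click the "Login" button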
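
The steps above could be sketched roughly as follows. This is only an illustration, not the site's actual markup: the four CSS selectors (`#login-link`, `#email`, `#password`, `#login-button`) and the `find` helper are hypothetical, so inspect the real page and substitute whatever identifies those elements. The imports assume HtmlUnit 2.x (`com.gargoylesoftware.htmlunit`); in HtmlUnit 3.x the packages start with `org.htmlunit` instead.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

class UiLoginSketch {

    // Hypothetical helper: look up an element and fail with a clear message if it is missing.
    static HtmlElement find(HtmlPage page, String cssSelector) {
        HtmlElement element = page.querySelector(cssSelector);
        if (element == null) {
            throw new IllegalStateException("No element matches " + cssSelector);
        }
        return element;
    }

    public static void main(String[] args) throws Exception {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setCssEnabled(false);
            webClient.setAjaxController(new NicelyResynchronizingAjaxController());

            // 1. Get the page and give its scripts time to run.
            HtmlPage page = webClient.getPage("https://www.marinetraffic.com/");
            webClient.waitForBackgroundJavaScript(10_000);

            // 2-3. Locate and click the sign-in element (selector is an assumption).
            find(page, "#login-link").click();
            webClient.waitForBackgroundJavaScript(5_000);

            // 4-5. Locate the input fields and type the credentials (selectors are assumptions).
            find(page, "#email").type("dummy@gmail.com");
            find(page, "#password").type("fakepassword");

            // 6. Locate and click the login button; the page's own script sends the Ajax POST.
            find(page, "#login-button").click();
            webClient.waitForBackgroundJavaScript(10_000);

            // The site reloads after login, so read whatever page the window holds now.
            HtmlPage loggedIn = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
            System.out.println(loggedIn.asXml());
        }
    }
}
```

Because the browser-like client sends the Ajax request itself, there is no need to copy headers or cookies by hand; the same `WebClient` keeps the session cookies for later `getPage` calls.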

You may have to add some wait code somewhere along the way.
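
One possible shape for such wait code, besides the `waitForBackgroundJavaScript` call already used in the question, is a small polling loop that retries a condition until it holds or a timeout expires. The sketch below uses only the Java standard library; the class name `WaitSketch` and the method `waitUntil` are made up for this example and are not part of HtmlUnit.

```java
import java.util.function.BooleanSupplier;

class WaitSketch {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false if the wait timed out.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Compare by subtraction so the check stays correct if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

With HtmlUnit the condition would typically test for something that only exists after login, for example `() -> page.querySelector("#logout") != null`, where `#logout` is again a hypothetical selector.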
