Puppeteer: scraping HTML from a page that does not refresh after clicking an input-tag button



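I'm trying to scrape some HTML after clicking an input-tag button. I'm using page.evaluate() to click the button because page.click() doesn't seem to work on input-tag buttons. I did some visual debugging with headless: false in the Puppeteer launch options and verified that the browser really does switch to the new state after the button is clicked. I'm not sure why page.content() returns the HTML from before the click rather than the HTML after the event.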

const puppeteer = require('puppeteer');
const url = 'http://www.yvr.ca/en/passengers/flights/departing-flights';
const fs = require('fs');
const tomorrowSelector = '#flights-toggle-tomorrow'
puppeteer.launch().then(async browser => {
    const page = await browser.newPage();
    await page.goto(url);
    await page.evaluate((selector)=>document.querySelector(selector).click(),tomorrowSelector);
    let html = await page.content();
    await fs.writeFile('index.html', html, function(err){
        if (err) console.log(err);
        console.log("Successfully Written to File.");
    });
    await browser.close();
});

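You can click the label of the radio input instead. You also need to wait for some sign that the state has changed (an XHR/fetch response or a new selector), because page.content() otherwise runs before the page has updated. For example, this code works for me, but you can use any other condition or simply wait a few seconds.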

const fs = require('fs');
const puppeteer = require('puppeteer');
const url = 'http://www.yvr.ca/en/passengers/flights/departing-flights';
const tomorrowLabelSelector = 'label[for=flights-toggle-tomorrow]';
const tomorrowLabelSelectorChecked = '.yvr-form__toggle:checked + label[for=flights-toggle-tomorrow]';
puppeteer.launch({ headless: false }).then(async (browser) => {
  const page = await browser.newPage();
  await page.goto(url);
  await Promise.all([
    page.click(tomorrowLabelSelector),
    page.waitForSelector(tomorrowLabelSelectorChecked),
  ]);
  const html = await page.content();
  try {
    // fs.promises.writeFile returns a promise, so this await really waits for the write.
    await fs.promises.writeFile('index.html', html);
    console.log('Successfully Written to File.');
  } catch (err) {
    console.log(err);
  }
  // await browser.close();
});
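
If you would rather wait for the XHR/fetch response than for a selector, something along these lines might work. This is only a sketch: clickAndGetHtml is a hypothetical helper name, and urlPart is a placeholder for a fragment of whatever request URL the page actually fires (check the Network tab; I have not verified the endpoint for this site). The wait is started before the click so a fast response is not missed.

```javascript
// Hypothetical helper: click an element, wait for a matching network
// response, then return the page HTML.
// `page` is a Puppeteer Page; `urlPart` is an assumed substring of the
// request URL triggered by the click (not taken from the original site).
async function clickAndGetHtml(page, selector, urlPart, timeout = 30000) {
  await Promise.all([
    // Register the wait first so the response cannot arrive unnoticed.
    page.waitForResponse(
      (res) => res.url().includes(urlPart) && res.ok(),
      { timeout }
    ),
    page.click(selector),
  ]);
  // The response has arrived; the DOM may still need a moment to render it.
  return page.content();
}

// Usage inside the async callback from the answer above ('/flights' is a guess):
// const html = await clickAndGetHtml(page, tomorrowLabelSelector, '/flights');
```

Note that a finished response does not guarantee the DOM has been re-rendered yet, so combining this with page.waitForSelector() on an element that only appears in the new state is probably the safer choice.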
