My end goal is to get the base64 text of a captcha image on a page. The page contains this markup:
</div>
<div _ngcontent-qjw-c117 class="mb-16">
<img _ngcontent-qjw-c117 alt="Image verification" width="100" height="50" src="data:image/png;base64,iVBOR...AAEOWcJLXLQAAAABJRU5ErkJggg==">
</div>
In Chrome's console, the following works fine:
var yes = document.getElementsByClassName("mb-16")[1].firstElementChild.src;
That's great. Now I want to do the same thing with Puppeteer.
In Puppeteer, I have the following code:
import puppeteer from 'puppeteer';
(async () => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('theURL');
const element = await page.$('document.getElementsByClassName("mb-16")[1].firstElementChild.src;');
console.log(element);
await browser.close();
})();
This fails:
$ node index.js
file:///Users/.../node_modules/puppeteer-core/lib/esm/puppeteer/common/ExecutionContext.js:225
throw new Error('Evaluation failed: ' + getExceptionMessage(exceptionDetails));
^
Error: Evaluation failed: DOMException: Failed to execute 'querySelector' on 'Document': 'document.getElementsByClassName("mb-16")[1].firstElementChild.src;' is not a valid selector.
at pptr://__puppeteer_evaluation_script__:5:24
at ExecutionContext._ExecutionContext_evaluate (file:///Users/.../node_modules/puppeteer-core/lib/esm/puppeteer/common/ExecutionContext.js:225:15)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async ElementHandle.evaluateHandle (file:///Users/...node_modules/puppeteer-core/lib/esm/puppeteer/common/JSHandle.js:94:16)
at async internalHandler.queryOne (file:///Users.../node_modules/puppeteer-core/lib/esm/puppeteer/common/QueryHandler.js:25:30)
at async ElementHandle.$ (file:///.../node_modules/puppeteer-core/lib/esm/puppeteer/common/ElementHandle.js:93:17)
at async file:///Users/..../index.js:7:19
Node.js v18.12.1
How do I get the src of an <img> element from Puppeteer? I have gone through other similar questions without success.
import puppeteer from 'puppeteer';
(async () => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('theURL', {
waitUntil: 'networkidle0', // resolve only once there have been no network connections for at least 500 ms
});
// The callback passed to page.evaluate runs inside the page (where `document` exists), not in Node.
// Always return the value from the callback so Puppeteer can pass it back to your script.
const element = await page.evaluate(() => {
const element = document.getElementsByClassName("mb-16")[1].firstElementChild.src;
return element
});
console.log(element);
await browser.close();
})();
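The error in the question says the string was handed to querySelector, so page.$ needs a CSS selector rather than a JavaScript expression. As an alternative to page.evaluate, here is a minimal sketch using page.$eval, which looks up a selector in the page and passes the matched element to a callback. The helper names (parseDataUrl, getCaptchaBase64) and the selector are assumptions for illustration; the selector is guessed from the markup in the question and may need adjusting for the real page.

```javascript
// Split a data URL such as "data:image/png;base64,AAAA" into its MIME type
// and base64 payload. Only the plain "data:<type>;base64,<payload>" form is handled.
function parseDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error('Not a base64 data URL');
  }
  return { mimeType: match[1], base64: match[2] };
}

// Hypothetical helper: `page` is a Puppeteer Page that has already navigated.
async function getCaptchaBase64(page) {
  // Assumed selector: the <img alt="Image verification"> that is a direct
  // child of a div with class "mb-16", as in the question's markup.
  const selector = '.mb-16 > img[alt="Image verification"]';

  // Wait for the image to exist in the DOM before reading it.
  await page.waitForSelector(selector);

  // page.$eval runs querySelector(selector) in the page and calls the
  // callback with the matched element; the returned string comes back to Node.
  const src = await page.$eval(selector, (img) => img.src);

  return parseDataUrl(src).base64;
}
```

In the script above, this would be called after page.goto(...) as `const base64 = await getCaptchaBase64(page);`. Waiting for the specific element can also stand in for waiting on network idle, if the page keeps connections open.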
More information:
Page.waitForNetworkIdle() method - Puppeteer