Fireship web scraping project



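I'm working through the https://fireship.io/lessons/web-scraping-guide/ project, and I'm a beginner at both web scraping and Firebase. I've reached the "npm run serve" part. But how do you pass in a request body like {"text": "https://fireship.io and https://www.instagram.com"}? Do I type it into my terminal window before running "npm run serve", or do I put it in the index.js file? I'm not using the "Insomnia" software mentioned in the video, and I don't know what it is. I'm on a Mac. I don't understand how to actually use the two web scraping functions. Can anyone explain it simply? The error from my terminal window is attached below. Thanks for any help!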

i  functions: Beginning execution of "scraper"
>  (node:26619) UnhandledPromiseRejectionWarning: SyntaxError: Unexpected token o in JSON at position 1
>      at JSON.parse (<anonymous>)
>      at /Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/index.js:121:27
>      at cors (/Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:188:7)
>      at /Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:224:17
>      at originCallback (/Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:214:15)
>      at /Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:219:13
>      at optionsCallback (/Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:199:9)
>      at corsMiddleware (/Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/node_modules/cors/lib/index.js:204:7)
>      at /Users/stanleyjeong/Desktop/_CODING/web-scraper/fireship-webscraper/functions/index.js:118:5
>      at /usr/local/lib/node_modules/firebase-tools/lib/emulator/functionsEmulatorRuntime.js:570:20
>  (node:26619) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
>  (node:26619) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
⚠  functions: Your function timed out after ~60s. To configure this timeout, see
https://firebase.google.com/docs/functions/manage-functions#set_timeout_and_memory_allocation.
>  /usr/local/lib/node_modules/firebase-tools/lib/emulator/functionsEmulatorRuntime.js:619
>                  throw new Error("Function timed out.");
>                  ^
>  
>  Error: Function timed out.
>      at Timeout._onTimeout (/usr/local/lib/node_modules/firebase-tools/lib/emulator/functionsEmulatorRuntime.js:619:23)
>      at listOnTimeout (internal/timers.js:549:17)
>      at processTimers (internal/timers.js:492:7)

Specifically for the error: UnhandledPromiseRejectionWarning: SyntaxError: Unexpected token o in JSON

I did some research and found the answer here: SyntaxError: Unexpected token o in JSON at position 1

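It turns out you don't need JSON.parse() in the index.js file. When the request is sent with a Content-Type of application/json, request.body already arrives as a parsed object. Passing that object to JSON.parse() coerces it to the string "[object Object]", which is why parsing fails on the "o" at position 1.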
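To make the cause concrete, here is a minimal sketch that runs outside Firebase. The helper name parseBody is hypothetical and not part of the tutorial; it only illustrates one defensive way to accept either a raw JSON string or an already-parsed object.

```javascript
// Hypothetical helper (not from the tutorial): accept either a raw JSON
// string or a body that the framework has already parsed into an object.
function parseBody(body) {
  return typeof body === "string" ? JSON.parse(body) : body;
}

// Reproduce the failure: JSON.parse first converts its argument to a
// string, so an object becomes "[object Object]", which is not valid JSON.
function reproduceError() {
  try {
    JSON.parse({ text: "https://fireship.io" });
    return null;
  } catch (err) {
    return err.name; // the error is a SyntaxError
  }
}
```

With a helper like this, the function would behave the same whether the client sends the body as application/json or as plain text.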

exports.scraper = functions.https.onRequest((request, response) => {
  cors(request, response, async () => {
    // request.body is already a parsed object when the request is sent as
    // application/json, so JSON.parse(request.body) is not needed here.
    const body = request.body;
    // const data = await scrapeMetatags(body.text);
    const data = await scrapeImages(body.text);
    response.send(data);
  });
});
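As for how to pass the body in: it does not go into index.js or into the terminal before "npm run serve". It is sent as an HTTP POST request to the running function, which is what Insomnia does in the video. The following is a rough sketch of doing the same from a small Node script, assuming Node 18 or later (for the built-in fetch). The names buildScraperRequest and callScraper are hypothetical, and EMULATOR_URL is a placeholder: use the URL that the emulator prints when it starts.

```javascript
// Placeholder: copy the real function URL from the "npm run serve" output.
const EMULATOR_URL = "http://localhost:5000/<project-id>/us-central1/scraper";

// Build the fetch options for a JSON POST. The Content-Type header is what
// lets the server parse the body into request.body.
function buildScraperRequest(text) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}

// Send the request and return the decoded JSON response.
async function callScraper(url, text) {
  const res = await fetch(url, buildScraperRequest(text));
  return res.json();
}

// Usage, from a second terminal while the emulator is running:
// callScraper(EMULATOR_URL, "https://fireship.io and https://www.instagram.com")
//   .then(console.log)
//   .catch(console.error);
```

Any HTTP client that can send a JSON POST (Insomnia, Postman, curl) should work the same way; the script is just one option that needs no extra software.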
