Adding complexity to a web scraper

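How can I automate my web scraper further? At the moment it can only fetch a single, constant URL. How can I add a feature so that it searches multiple pages within one fixed website?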

Here is my code:

const PORT = 8000
const axios = require('axios')
const cheerio = require('cheerio')
const express = require('express')

const app = express()
const url = 'heresurdata'

// Fetch the page and print its HTML
axios(url)
  .then(response => {
    const html = response.data
    console.log(html)
  })
  .catch(err => console.error(err))

// Backticks (a template literal) are needed for ${PORT} to be interpolated
app.listen(PORT, () => console.log(`server running on PORT ${PORT}`))

You can make a request to the first page, find all the anchor tags, get their links, make requests to those links, and repeat the process:

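// Assumes axios and cheerio are required as in the question.
async function getLinks(url) {
  const { data } = await axios.get(url);
  const $ = cheerio.load(data);
  // Collect the href attribute of every anchor tag in the body
  return $('body a').map((index, el) => $(el).attr('href')).get();
}
// Now loop through the returned links and scrape them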
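
The "repeat the process" step could be sketched roughly as below. This is only a minimal illustration, not a definitive implementation: `sameSiteLinks`, `crawl`, the `getHrefs` callback and the `maxPages` limit are hypothetical names introduced here, not part of axios or cheerio. It assumes you want to stay on the same origin as the start page, visit each page once, and stop after a fixed number of pages. The page-fetching step is passed in as a function so the loop itself needs nothing beyond Node's built-in `URL` class.

```javascript
// Hypothetical helper: resolve raw href values against the page URL and
// keep only links on the same origin as the start page.
function sameSiteLinks(hrefs, pageUrl, origin) {
  const links = new Set();
  for (const href of hrefs) {
    if (!href) continue; // anchors without an href
    let resolved;
    try {
      resolved = new URL(href, pageUrl); // also handles relative links
    } catch {
      continue; // skip malformed hrefs
    }
    resolved.hash = ''; // treat page#a and page#b as the same page
    if (resolved.origin === origin) links.add(resolved.href);
  }
  return [...links];
}

// Hypothetical breadth-first crawl. getHrefs(url) is supplied by the caller
// and should resolve to the array of href strings found on that page.
async function crawl(startUrl, getHrefs, maxPages = 20) {
  const start = new URL(startUrl);
  const visited = new Set();
  const queue = [start.href];
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    let hrefs = [];
    try {
      hrefs = await getHrefs(url);
    } catch (err) {
      console.error(`Failed to fetch ${url}: ${err.message}`);
    }
    for (const link of sameSiteLinks(hrefs, url, start.origin)) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return [...visited]; // URLs in the order they were visited
}

// Possible usage with axios and cheerio (assumed to be required already):
// crawl(url, async (pageUrl) => {
//   const { data } = await axios.get(pageUrl);
//   const $ = cheerio.load(data);
//   return $('body a').map((index, el) => $(el).attr('href')).get();
// }).then(pages => console.log(pages));
```

The `visited` set is what keeps the crawl from looping forever when pages link back to each other, and the `maxPages` cap is a simple safeguard against crawling far more of the site than intended.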
