I am trying to retrieve a URL using requests.get:
import requests
from bs4 import BeautifulSoup
baseurl = "https://www.olx.com.eg/"
headers = {
'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36'
}
r = requests.get('https://www.olx.com.eg/jobs/')
soup = BeautifulSoup(r.content, 'lxml')
product_list = soup.findAll('div',class_ = 'ads__item')
print(product_list)
But it returns an empty list, as if it is not even opening the URL.
What is the problem here?
You define headers but never pass it to the request. Add the headers= parameter to requests.get:
import requests
from bs4 import BeautifulSoup
baseurl = "https://www.olx.com.eg/"
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"
}
r = requests.get("https://www.olx.com.eg/jobs/", headers=headers)
soup = BeautifulSoup(r.content, "lxml")
product_list = soup.findAll("div", class_="ads__item")
print(len(product_list))
Prints:
45
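To see what the headers= argument actually changes, here is a minimal sketch that builds the request without sending it and inspects the User-Agent that requests would transmit. The helper name effective_user_agent is hypothetical, and the idea that the site rejects the default user agent is an assumption based on the behaviour described above, not something confirmed by the site.

```python
import requests

URL = "https://www.olx.com.eg/jobs/"
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"
}


def effective_user_agent(url, headers=None):
    """Hypothetical helper: return the User-Agent requests would send.

    The request is only prepared, never sent, so no network access happens.
    """
    with requests.Session() as session:
        # prepare_request merges the session's default headers with ours,
        # which is the same merge that requests.get performs internally.
        prepared = session.prepare_request(
            requests.Request("GET", url, headers=headers)
        )
    return prepared.headers["User-Agent"]


# Without headers=, requests identifies itself as "python-requests/<version>".
default_ua = effective_user_agent(URL)
# With headers=, the browser-like string replaces that default.
browser_ua = effective_user_agent(URL, BROWSER_HEADERS)

print(default_ua)
print(browser_ua)
```

If the list is still empty after passing the headers, printing r.status_code on the real response is a quick way to check whether the server refused the request or simply returned different markup.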