
Python web scraping: 403 error even with a User-Agent header

Updated: 2024-04-20

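I'm new to Python. I'm using BeautifulSoup and Requests to scrape "https://batdongsan.com.vn/nha-dat-ban-tp-hcm" to collect housing price data for my hometown, but I still get blocked with a 403 error even after setting a User-Agent header. Here is my code: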

import requests

url3 = "https://batdongsan.com.vn/nha-dat-ban-tp-hcm"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36 Edg/103.0.1264.49"}

page = requests.get(url3, headers=headers)

print(page)

Result: <Response [403]>

Has anyone run into the same problem and managed to get around it? Any help is much appreciated.

Thanks

import cloudscraper
from bs4 import BeautifulSoup

scraper = cloudscraper.create_scraper()
soup = BeautifulSoup(scraper.get("https://batdongsan.com.vn/nha-dat-ban-tp-hcm").text, "html.parser")
print(soup.text)  # do what you want with the response

You can install cloudscraper with pip install cloudscraper.
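
Whether cloudscraper gets through can depend on how the site is protected at the time, so it may help to check the status code before handing the body to BeautifulSoup. The following is only a minimal sketch, not part of the original answer: `fetch_html` and `main` are hypothetical helper names, and the 30-second timeout is an arbitrary assumption. It works with any object that has a requests-style `get()` method, including the scraper returned by `cloudscraper.create_scraper()`.

```python
def fetch_html(session, url, timeout=30):
    """Return the page body, or raise if the site answered with an error status.

    `session` is assumed to be anything with a requests-style get() method,
    e.g. cloudscraper.create_scraper() or a plain requests.Session().
    `fetch_html` itself is a hypothetical helper, not a library function.
    """
    response = session.get(url, timeout=timeout)
    if response.status_code == 403:
        # Still blocked: stop here instead of parsing the block page as data.
        raise PermissionError(f"403 Forbidden for {url}")
    # Any other 4xx/5xx status raises an HTTPError on a real requests response.
    response.raise_for_status()
    return response.text


def main():
    # Third-party imports are kept local so fetch_html can be used without them.
    import cloudscraper
    from bs4 import BeautifulSoup

    scraper = cloudscraper.create_scraper()
    html = fetch_html(scraper, "https://batdongsan.com.vn/nha-dat-ban-tp-hcm")
    soup = BeautifulSoup(html, "html.parser")
    print(soup.title)
```

Calling `main()` runs the request against the real site. If it raises `PermissionError`, the site is still refusing the request and there is nothing useful to parse yet.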
