This is my first attempt at web scraping. I'm trying to look up gas prices by state. The first code I wrote is:
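import requests
from bs4 import BeautifulSoup

url = "https://www.gasbuddy.com/usa/la"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
# Text of the element whose id is the state name
title = soup.find(id="Nevada").get_text()
# Text of the first div whose class attribute is exactly this string
price = soup.find("div", class_="col-sm-2 col-xs-3 text-right").get_text()
print(price)
print(title)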
Now I want the user to be able to enter the state. In the first program I hard-coded a single state, written as:
title = soup.find(id= "Nevada").get_text()
What do I need to do to make this work?
State = input("Input Your State ")
title = soup.find(id= State ).get_text()
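The site is protected by Cloudflare, which is why you can't scrape it with a plain requests call. You can use the cloudscraper module to scrape it instead. Install: pip install cloudscraper
Code: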
import cloudscraper
from bs4 import BeautifulSoup

# create_scraper() returns a requests-style session that handles the Cloudflare check
scraper = cloudscraper.create_scraper()
url = "https://www.gasbuddy.com/usa/la"
page = scraper.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
# strip() removes stray spaces around what the user typed
state = input("Input Your State ").strip()
title = soup.find(id=state).get_text()
price = soup.find("div", class_="col-sm-2 col-xs-3 text-right").get_text()
print(price)
print(title)
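One thing to watch for: soup.find() returns None when no element has the id the user typed, so calling .get_text() on the result raises AttributeError for a misspelled or differently cased state. Below is a minimal sketch of a guard for that. The helper names normalize_state and find_state_text are made up for illustration, and the title-casing assumes the page's ids look like "Nevada" (the only id shown in the question); ids for multi-word states may be formatted differently, so check the page source.

```python
def normalize_state(raw):
    """Collapse whitespace and title-case the input, e.g. '  nevada ' -> 'Nevada'.

    Assumption: the page uses title-cased state names as ids.
    """
    return " ".join(raw.split()).title()


def find_state_text(soup, raw_state):
    """Return the text of the element whose id matches the state, or None.

    `soup` is expected to be a BeautifulSoup object. find() returns None
    when nothing matches, so check before calling get_text().
    """
    element = soup.find(id=normalize_state(raw_state))
    if element is None:
        return None
    return element.get_text(strip=True)
```

With the soup built in the code above, find_state_text(soup, input("Input Your State ")) gives you the text or None, and you can print a "state not found" message in the None case instead of crashing.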