I'm trying to retrieve the price of an item from this Amazon page. URL:
https://www.amazon.com/FANMATS-University-Longhorns-Chrome-Emblem/dp/B00EPDLL6U/
Source code:
from bs4 import BeautifulSoup
import requests
text = "https://www.amazon.com/FANMATS-University-Longhorns-Chrome-Emblem/dp/B00EPDLL6U/"
page = requests.get(text)
data = page.text
soup = BeautifulSoup(data, 'lxml')
web_text = soup.find_all('div')
print(web_text)
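Every time I run the program, I get HTML output that is completely different from the web page, for example:
"Sorry! Something went wrong on our end. Please go back and try again…">
I don't know what I'm doing wrong, and any help would be greatly appreciated. I'm new to Python and web scraping, so I apologize if my question is super obvious. Thanks! :)
The site serves its content dynamically, which requests can't handle: requests only downloads the raw HTML and does not execute JavaScript. Use selenium: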
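import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

url = 'https://www.amazon.com/FANMATS-University-Longhorns-Chrome-Emblem/dp/B00EPDLL6U/'
# Selenium 4 takes the chromedriver path through a Service object;
# adjust the path to wherever chromedriver.exe lives on your machine.
driver = webdriver.Chrome(service=Service(r'C:\Program Files\ChromeDriver\chromedriver.exe'))
driver.get(url)
time.sleep(3)  # give the page a few seconds to finish rendering
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
print(soup.select_one('span#priceblock_ourprice').get_text())
driver.quit()  # close the browser and end the driver session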
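
One thing to watch for: `select_one` returns `None` when nothing matches, so the `print` line raises an `AttributeError` whenever the price element is missing, for example when you get the error page from the question. Here is a minimal sketch of a safer lookup. The helper name `extract_price` and the two sample HTML strings (including the price in them) are made up for illustration; only the `span#priceblock_ourprice` selector comes from the code above, and the real page markup may differ.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_price(html: str, selector: str = "span#priceblock_ourprice") -> Optional[str]:
    """Return the stripped text of the first element matching selector, or None."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one(selector)
    if node is None:
        # Nothing matched: the page may be an error page or use different markup.
        return None
    return node.get_text(strip=True)


# Made-up sample documents, not real Amazon markup.
sample_ok = '<div><span id="priceblock_ourprice"> $19.99 </span></div>'
sample_error = '<div><img alt="Sorry! Something went wrong on our end."></div>'

print(extract_price(sample_ok))
print(extract_price(sample_error))
```

With selenium you would call it as `extract_price(driver.page_source)` and check for `None` before using the result.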