Unable to extract a product title from Amazon



When I try to get the title of the Sony headphones with the following code, the result is empty:

import requests
from bs4 import BeautifulSoup
URL = 'https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ%3D%3D-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER'
headers = {"User-Agent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
soup.prettify()
#print(soup)
title = soup.find_all('span', {'id':'productTitle'})
print(title, len(title))

The current output is:

[] 0

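I spent the last two hours trying to scrape this title with BeautifulSoup. I tried grabbing other elements on the page, without success. I tried writing the raw content to a file, but that failed because of odd characters in the response.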
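The failed file dump was possibly caused by writing the response as text through the platform's default encoding. A minimal sketch, assuming that is the cause: write the response bytes (`page.content` in the question's script) in binary mode, then check whether the raw HTML mentions the `productTitle` id at all. `save_raw` and `has_title_span` are hypothetical helper names used for illustration; they are not part of requests or BeautifulSoup.

```python
from pathlib import Path


def save_raw(content: bytes, path: str) -> int:
    """Write the response body byte-for-byte and return the number of bytes written."""
    # Binary mode skips the text-encoding step, so characters that the
    # default encoding cannot represent no longer interrupt the write.
    return Path(path).write_bytes(content)


def has_title_span(content: bytes) -> bool:
    """Rough check: does the raw HTML contain an element with id productTitle?"""
    return b'id="productTitle"' in content or b"id='productTitle'" in content


# Intended use with the question's script (assumes `page` is the requests response):
#   save_raw(page.content, "amazon_page.html")
#   print(page.status_code, has_title_span(page.content))
```

If `has_title_span` returns False, the server sent a page without the title element, and no change to the BeautifulSoup selector will find it. Opening the saved file in a browser shows what was actually returned.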

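I tried Ahmed's answer and still got nothing. I tried a bunch of other solutions I found online, and still nothing. For the life of me, I can't figure out how to scrape this with BeautifulSoup.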

I know you use Selenium, so here is a Selenium solution.

from selenium import webdriver
from selenium.webdriver.common.by import By

bot = webdriver.Chrome()
bot.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
# find_element_by_id was removed in newer Selenium 4 releases; find_element(By.ID, ...) replaces it
title = bot.find_element(By.ID, 'productTitle').text
print(title)
bot.quit()  # end the browser session, not just the current window

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.amazon.com/Sony-Noise-Cancelling-Headphones-WH1000XM3/dp/B07G4MNFS1/ref=sxin_0_ac_d_rm?ac_md=0-0-c29ueQ==-ac_d_rm&keywords=sony&pd_rd_i=B07G4MNFS1&pd_rd_r=3e6d5325-8ee4-4ba8-a84f-1b7cf2ce98bf&pd_rd_w=BVSFq&pd_rd_wg=I0LMZ&pf_rd_p=e2f20af2-9651-42af-9a45-89425d5bae34&pf_rd_r=VGT25BXXZNDE3B61A994&psc=1&qid=1577253649&smid=ATVPDKIKX0DER")
soup = BeautifulSoup(r.text, 'html.parser')
# Print the stripped text of every span whose id is productTitle
for item in soup.find_all("span", {'id': 'productTitle'}):
    print(item.get_text(strip=True))

Output:

Sony Noise Cancelling Headphones WH1000XM3: Wireless Bluetooth Over the Ear Headphones with Mic and Alexa voice control - Industry Leading Active Noise Cancellation - Black
