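I'm working on a class project that uses BeautifulSoup and Selenium's webdriver to scrape the name, price, number of reviews, and rating of disposable diapers on Amazon.
My goal is to end up with output like this for each product: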
Diapers Size 4, 150 Count - Pampers Swaddlers Disposable Baby Diapers, One Month Supply
4.0 out of 5 stars
1,982
$43.98
($0.29/Count)
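Unfortunately, after about 50 results have been printed, I get this message:
Message: no such element: Unable to locate element: {"method":"css selector","selector":".a-last"}
Here is my code: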
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

URL = "https://www.amazon.com/s?k=baby+disposable&rh=n%3A166772011&ref=nb_sb_noss"
driver = webdriver.Chrome('C:/Users/Desktop/chromedriver_win32/chromedriver.exe')
driver.get(URL)
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")
df = pd.DataFrame(columns = ["Product Name","Rating","Number of Reviews","Price","Price Count"])
while True:
    # Each search result sits in a container with this exact class string
    for i in soup.find_all(class_= "sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"):
        ProductName = i.find(class_= "a-size-base-plus a-color-base a-text-normal").text
        print(ProductName)
        # Fields that are missing for a product are recorded as "Null"
        try:
            Rating = i.find(class_= "a-icon-alt").text
        except:
            Rating = "Null"
        print(Rating)
        try:
            NumberOfReviews = i.find(class_= "a-size-base").text
        except:
            NumberOfReviews = "Null"
        print(NumberOfReviews)
        try:
            Price = i.find(class_= "a-offscreen").text
        except:
            Price = "Null"
        print(Price)
        try:
            PriceCount = i.find(class_= "a-size-base a-color-secondary").text
        except:
            PriceCount = "Null"
        print(PriceCount)
        df = df.append({"Product Name":ProductName, "Rating":Rating, "Number of Reviews":NumberOfReviews,
                        "Price":Price, "Price Count":PriceCount}, ignore_index = True)
    # A disabled "Next" button means there are no more pages
    nextlink = soup.find(class_= "a-disabled a-last")
    if nextlink:
        print ("This is the last page. ")
        break
    else:
        # Click "Next" and re-parse the new page (this is the call that raises the error)
        progress = driver.find_element_by_class_name('a-last').click()
        subhtml = driver.page_source
        soup = BeautifulSoup(subhtml, "html.parser")
I've hit a roadblock trying to figure out why the a-last element isn't being found.
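This error occurs because the web element has not yet loaded on the page. Before performing any action on a web element, you need to make sure it is present and loaded on the page. How do you do that? By using one of the synchronization (wait) mechanisms: implicit, explicit, or fluent waits. Any of them can be used to wait for the a-last element to appear before clicking it. For your code, you could use an explicit wait: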
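from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 30)  # wait up to 30 seconds
element_to_click = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, "a-last")))
element_to_click.click()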
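To make the idea behind an explicit wait more concrete, here is a minimal sketch in plain Python of the polling loop such a wait performs. It is not Selenium's implementation: wait_until, WaitTimeout, and find_next_button are hypothetical names invented for this illustration, and the "page" is simulated so the snippet runs without a browser. It also treats every exception as "not ready yet", which is a simplification.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Exceptions raised by `condition` are treated as "not ready yet"
    (a simplification made for this sketch).
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # element not there yet; try again after a short pause
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated page: the "Next" button only shows up on the third lookup.
attempts = {"count": 0}


def find_next_button():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("no such element: .a-last")
    return "next-button"


button = wait_until(find_next_button, timeout=2.0, poll_interval=0.01)
print(button, "found after", attempts["count"], "lookups")
```

The point of the sketch is that the lookup is retried for a bounded time instead of failing on the first miss, which is why clicking right after a page change is more reliable with a wait in front of it.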