Selenium: XPath returns blank text



So I have this URL:

https://www.amazon.com/RevitaLash-Cosmetics-RevitaBrow-Advanced-Conditioner/product-reviews/B009QZCAM6/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews

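I want to scrape the rating of every review, but none of the XPath variants I have tried return any text.

The XPath does match 10 elements with the text "x out of 5 stars" when I search the page with it. Here is what I have so far: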

from bs4 import BeautifulSoup
import requests
import csv
import os
import pandas as pd
from selenium import webdriver

chromedriver = "path to chromedriver"
driver = webdriver.Chrome(chromedriver)
url = "https://www.amazon.com/RevitaLash-Cosmetics-RevitaBrow-Advanced-Conditioner/product-reviews/B009QZCAM6/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
driver.get(url)
ratings = driver.find_elements_by_xpath('//div[@class="a-section a-spacing-none review-views celwidget"]//div[@class="a-row"]/a[@class="a-link-normal"]/i/span')
#ratings = driver.find_elements_by_xpath('/*//div[@id="cm_cr-review_list"]//i[@data-hook="review-star-rating"]/span[@class="a-icon-alt"]')
#ratings = driver.find_elements_by_xpath('/*//div[@id="cm_cr-review_list"]//i[@data-hook="review-star-rating"]/span')
rating_row = []
for rating in ratings:
    rating_row.append(rating.text)

But when I inspect rating_row, it only contains a list of blank strings:

> rating_row
['', '', '', '', '', '', '', '', '', '']

What am I doing wrong here, and how can I fix it? The result is the same for other URLs containing other reviews.

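Replace the line rating_row.append(rating.text) with rating_row.append(rating.get_attribute('innerHTML')). The .text property only returns text that is rendered visibly, whereas innerHTML returns the element's content regardless of visibility.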
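
A minimal sketch of that fix as a reusable helper, assuming the rating span is hidden by CSS (the page markup is not shown here, so that is a guess). It reads textContent rather than innerHTML so that any nested markup is dropped; the name read_hidden_text is hypothetical:

```python
def read_hidden_text(elements):
    """Return the text of each element, even when it is not visibly rendered."""
    texts = []
    for element in elements:
        # .text is empty for elements the browser does not display,
        # so fall back to the raw DOM text in that case.
        text = element.text or element.get_attribute("textContent") or ""
        texts.append(text.strip())
    return texts


# Usage with the driver from the question (not executed here):
# ratings = driver.find_elements_by_xpath('//i[@data-hook="review-star-rating"]/span')
# rating_row = read_hidden_text(ratings)
```

The helper only needs objects that expose .text and get_attribute(), so it can be tried out with a stand-in object before pointing it at a live page.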

You can try the following XPath:

//i[@data-hook='review-star-rating']/span

Code:

from selenium.webdriver.common.by import By

rating_row = []
for rating in driver.find_elements(By.XPATH, "//i[@data-hook='review-star-rating']/span"):
    rating_row.append(rating.text)
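
Once the strings come back non-empty, they will presumably look like the "x out of 5 stars" label mentioned in the question. A small sketch for turning such a label into a number, assuming that English wording; parse_rating and RATING_PATTERN are hypothetical names, not part of Selenium:

```python
import re

# Matches labels like "4.0 out of 5 stars" or "5 out of 5 stars".
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s+out of\s+(\d+)")


def parse_rating(label):
    """Extract the numeric score from a star-rating label, or None if absent."""
    match = RATING_PATTERN.search(label)
    if match is None:
        return None  # blank or unexpected text
    return float(match.group(1))


# Example: convert every scraped label, skipping the ones that did not parse.
# scores = [s for s in map(parse_rating, rating_row) if s is not None]
```

Returning None for blank input makes it easy to spot elements whose text still came back empty.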

I also suggest adding an implicit wait with driver.implicitly_wait(30). You can put it right after the driver is created:

driver = webdriver.Chrome(chromedriver)
driver.implicitly_wait(30)

Read more about implicit waits here.
