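I wrote a program in Python Flask using Selenium that scrapes a product from a website. My intention is to first enter the product name (this is done programmatically); once the matching products are displayed, the program should read their prices, which I will print in the terminal. My problem, however, is that it does not scrape the website and instead throws a Selenium NoSuchElementException. Here is my code.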
def scrape_johnLewis(product_name):
    website_address = 'https://www.johnlewis.com/'
    options = webdriver.ChromeOptions()
    options.add_argument('start-maximized')
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    browser = webdriver.Chrome(ChromeDriverManager().install(), options=options)
    browser.get(website_address)
    time.sleep(10)
    browser.implicitly_wait(20)
    browser.find_element_by_css_selector('button.c-button-yMKB7 c-button--primary-39fbj c-button--inverted-UZv88 c-button--primary-3tLoH').click()
    browser.find_element_by_id('mobileSearch').send_keys(product_name)
    browser.find_element_by_css_selector('div.input-with-submit.module-inputWrapper--63f9e > button.button.module-c-button--fe2f1').click()
    time.sleep(5)
    # browser.find_elements_by_class_name('button.module-c-button--fe2f1')[0].submit()
    product_price_raw_list = browser.find_elements_by_xpath('//div[@class="info-section_c-product-card__section__2D2D- price_c-product-card__price__3NI9k"]/span')
    product_price_list = [elem.text for elem in product_price_raw_list]
    print(product_price_list)

if __name__ == "__main__":
    scrape_johnLewis('Canon EOS 90D Digital SLR Body')
The error occurs on this line: browser.find_element_by_css_selector('button.c-button-yMKB7 c-button--primary-39fbj c-button--inverted-UZv88 c-button--primary-3tLoH').click()
Here is the stack trace:
Traceback (most recent call last):
  File "scrapejohnLewis.py", line 32, in <module>
    scrape_johnLewis('Canon EOS 90D Digital SLR Body')
  File "scrapejohnLewis.py", line 20, in scrape_johnLewis
    browser.find_element_by_css_selector('button.c-button-yMKB7 c-button--primary-39fbj c-button--inverted-UZv88 c-button--primary-3tLoH').click()
  File "/home/mayureshk/.local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 598, in find_element_by_css_selector
    return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
  File "/home/mayureshk/.local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 978, in find_element
    'value': value})['value']
  File "/home/mayureshk/.local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/mayureshk/.local/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"button.c-button-yMKB7 c-button--primary-39fbj c-button--inverted-UZv88 c-button--primary-3tLoH"}
  (Session info: chrome=74.0.3729.108)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Linux 5.0.0-1034-oem-osp1 x86_64)
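I tried replacing it with find_element_by_tag_name, but that did not do the job either. I located the exact element by inspecting the website, yet to my surprise the error says there is no such element. What exactly is going on here? Please help.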
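The cookie prompt was not handled properly: it never went away, so it blocked the code that followed and prevented Selenium from finding the element. I also made a few adjustments to the code.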
import time

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager

def scrape_johnLewis(product_name):
    website_address = 'https://www.johnlewis.com/'
    options = webdriver.ChromeOptions()
    options.add_argument('start-maximized')
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    browser = webdriver.Chrome(ChromeDriverManager(log_level='0').install(), options=options)
    browser.get(website_address)
    time.sleep(3)
    # browser.implicitly_wait(20)
    # Dismiss the cookie banner first so that it no longer covers the page.
    browser.find_element_by_xpath('//*[@id="pecr-cookie-banner-wrapper"]/div/div[1]/div/div[2]/button[1]').click()
    # Type the product name into the desktop search box and submit it with ENTER.
    browser.find_element_by_id('desktopSearch').send_keys(product_name + Keys.ENTER)
    time.sleep(5)
    product_price_raw_list = browser.find_elements_by_xpath('//div[@class="info-section_c-product-card__section__2D2D- price_c-product-card__price__3NI9k"]/span')
    product_price_list = [elem.text for elem in product_price_raw_list]
    print(product_price_list)

# output
['£1,249.00', '£1,629.99', '£1,349.99']
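The fixed time.sleep() calls above only work if the cookie banner happens to appear within that window. Selenium ships its own explicit-wait helper, WebDriverWait, for exactly this situation. The sketch below is not that API; it is a dependency-free stand-in that shows the same idea: keep retrying a lookup until it succeeds or a timeout expires. The name wait_until and its parameters are made up for this illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Any exception raised by `condition` (for example Selenium's
    NoSuchElementException) is treated as "not ready yet".
    Raises TimeoutError if nothing truthy is returned in time.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception as error:  # e.g. the element is not in the DOM yet
            last_error = error
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout} s"
            ) from last_error
        time.sleep(poll_interval)


# Hypothetical usage with the `browser` object from the answer above:
# cookie_button = wait_until(
#     lambda: browser.find_element_by_xpath(
#         '//*[@id="pecr-cookie-banner-wrapper"]/div/div[1]/div/div[2]/button[1]'
#     )
# )
# cookie_button.click()
```

Compared with a fixed sleep, this returns as soon as the element is found and fails with a clear timeout error when it never shows up.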
Also, use css_selector only as a last resort, when you cannot find a better element locator such as id, xpath, or tag_name.
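If you do fall back to a CSS selector, note one detail about the selector in the question: in CSS a space means "descendant of", so 'button.c-button-yMKB7 c-button--primary-39fbj ...' looks for elements whose tag name is c-button--primary-39fbj nested inside the button, and those do not exist. Several classes on the same element have to be chained with dots. The helper below is a small sketch of that conversion; its name is made up for this example, and the class names come from the question and are generated by the site, so they may have changed. Whether this alone would have fixed the original error is not certain, since the cookie banner was also involved.

```python
def css_from_class_attribute(tag, class_attribute):
    """Build a compound CSS selector such as 'button.a.b' from a tag name
    and the raw value of an element's class attribute ('a b')."""
    class_names = class_attribute.split()
    return tag + "".join("." + name for name in class_names)


selector = css_from_class_attribute(
    "button",
    "c-button-yMKB7 c-button--primary-39fbj "
    "c-button--inverted-UZv88 c-button--primary-3tLoH",
)
print(selector)
```

The resulting string can be passed to find_element_by_css_selector in place of the space-separated version.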
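I once got the same error. Want to know how it was fixed?
I had accidentally left my Selenium WebDriver window open, so I copied the XPath from there, and my code worked.
I then found that the XPath of a particular element in the WebDriver window differed from the XPath of the same element in a regular browser window: the path in the WebDriver window contained an extra tag that was not present in the regular browser window.
Maybe you have the same problem on your side.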