Unable to scrape web data with Selenium



I am trying to scrape the data from the front-page table at https://icostats.com/, but it just isn't working.

from selenium import webdriver
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()
posts = browser.find_element_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)

The error I get:

c:\python36\python.exe c:/.../pycharmprojects/pyqtps/ico_spyder.py
Traceback (most recent call last):
  File "c:/.../pycharmprojects/pyqtps/ico_spyder.py", line 5, in <module>
    browser.find_element_by_xpath("""//*[@id="app"]""").click()
  File "c:\python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 313, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "c:\python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 791, in find_element
    'value': value})['value']
  File "c:\python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "c:\python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="app"]/div/div/div[2]/div[2]/div[2]/div[2]/div[1]/div[2]"}
  (Session info: chrome=59.0.3071.115)
  (Driver info: chromedriver=2.30.477700 (0057494ad8732195794a7b32078424f92a5fce41),platform=Windows NT 6.1.7600 x86_64)

Edit

I finally got it working:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child(2) > div:nth-child(8)")))
posts = browser.find_elements_by_class_name("thName-0-55")
for post in posts:
    print(post.text)
posts = browser.find_elements_by_class_name("tdName-0-73")
for post in posts:
    print(post.text)

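Is there any way to iterate over each header/column and export it to a CSV file, without having to go through each class like this?

The required data is generated dynamically by JavaScript. You need to wait until it is present on the page: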

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div#app>div")))
posts = browser.find_elements_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)
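
As a rough illustration of why the explicit wait helps, the following pure-Python sketch polls a condition until it returns something truthy or a timeout expires, which is the general idea behind WebDriverWait(...).until(...). The names wait_until and FakePage are hypothetical stand-ins introduced here, not part of Selenium, and the placeholder cell values are made up.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Hypothetical helper that mimics the idea of an explicit wait;
    it is not Selenium's implementation.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose table is rendered by JavaScript after a delay."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_cells(self):
        self.calls += 1
        if self.calls < self.ready_after:
            return []  # nothing rendered yet, like querying too early
        return ["cell A", "cell B"]  # placeholder values, not real site data


page = FakePage(ready_after=3)
cells = wait_until(page.find_cells, timeout=2.0, poll_interval=0.01)
print(cells)
```

With a real driver one could pass a callable such as lambda: browser.find_elements_by_class_name("tdPrimary-0-75"), although Selenium's own WebDriverWait already provides this behaviour and is the better choice in practice.
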
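  1. There appears to be no s() method, which is what this line calls:

browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()

So what you probably need is: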

browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").text
  2. Also, since you want to iterate over the results, this line:

    posts = browser.find_element_by_class_name("tdPrimary-0-75")

should be

posts = browser.find_elements_by_class_name("tdPrimary-0-75")
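
On the follow-up question about exporting every header and column without listing each class: one possible approach, sketched below under assumptions, is to collect the header texts and the cell texts with one selector each, then split the flat list of cells into rows by the number of headers. The helpers chunk_rows, write_table_csv and scrape_texts are hypothetical names, scrape_texts assumes the Selenium 4 style find_elements(by, value) call, and any CSS selector passed to it would have to be checked against the actual page markup.

```python
import csv
import io


def chunk_rows(cell_texts, n_columns):
    """Split a flat list of cell texts into rows of n_columns each."""
    if n_columns <= 0:
        raise ValueError("n_columns must be positive")
    return [cell_texts[i:i + n_columns]
            for i in range(0, len(cell_texts), n_columns)]


def write_table_csv(fileobj, headers, cell_texts):
    """Write one header row, then the cells grouped into rows."""
    writer = csv.writer(fileobj)
    writer.writerow(headers)
    writer.writerows(chunk_rows(cell_texts, len(headers)))


def scrape_texts(browser, css_selector):
    """Return the text of every element matching css_selector.

    Assumption: `browser` offers find_elements(by, value) as in Selenium 4;
    "css selector" is the string value behind By.CSS_SELECTOR.
    """
    return [el.text for el in browser.find_elements("css selector", css_selector)]


# Demonstration with made-up values instead of a live browser session.
buffer = io.StringIO()
write_table_csv(buffer, ["Name", "Price"], ["A", "1", "B", "2"])
print(buffer.getvalue())
```

In a real run, the headers and cells would come from scrape_texts(browser, ...) after the wait has succeeded, and the output would go to a file opened with open("table.csv", "w", newline=""). A prefix selector such as div[class^='tdName'] might match the name cells seen in the question, but that is an untested guess.
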
