I'm trying to use Selenium in Python to print output in this form:
title1
link1
title2
link2…
This is the code I have so far:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
PATH = r"C:UsersDesktoppymsedgedriver.exe"
driver = webdriver.Edge(PATH)
driver.maximize_window()
driver.get('https://www.google.com/')
searchbar = driver.find_element(by=By.CLASS_NAME, value='gLFyf')
searchbar.send_keys('selenium')
searchbar.send_keys(Keys.RETURN)
titles = driver.find_elements(by=By.CLASS_NAME, value='LC20lb')
links = driver.find_elements(by=By.TAG_NAME, value='a')
for link in links:
    href = link.get_attribute('href')
    print(href)
for title in titles:
    print(title.text)
time.sleep(5)
driver.quit()
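However, the links that get printed are Google search links, not the links to the websites themselves. Also, all the links are printed before the titles (I understand why this happens, but I don't know how to fix it).
How can I solve these two problems? Thanks in advance.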
Replace the for loops in your code with:
for i, link in enumerate(links):
    # titles is shorter than links, so skip the title once it runs out
    try:
        print(titles[i].text)
    except IndexError:
        pass
    print(link.get_attribute("href"))
    print()
Output:
Selenium Tutorial for Beginners: Learn WebDriver & Testing
https://www.google.com/search?q=selenium&source=lnms&tbm=bks&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoAXoECAIQAw
Selenium: Definition, How it works and Why you need it
https://www.google.com/search?q=selenium&source=lnms&tbm=isch&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoAnoECAIQBA
What Is Selenium - A Tutorial on How to Use ... - LambdaTest
https://www.google.com/search?q=selenium&source=lnms&tbm=vid&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoA3oECAIQBQ
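Note that the links in this output are still Google's own navigation links (Books, Images, Videos), because `links` collects every `a` element on the page and the loop pairs them with titles purely by position. A minimal sketch of one way around that, assuming each `LC20lb` heading sits inside the `a` element of its result (an assumption about Google's markup, which changes often): start from each title and walk up to its enclosing link. The helper names `is_google_url` and `collect_results` are made up for this example.

```python
from urllib.parse import urlparse


def is_google_url(href):
    """True if href points at google.com or one of its subdomains."""
    host = urlparse(href).hostname or ""
    return host == "google.com" or host.endswith(".google.com")


def collect_results(title_elements):
    """Return (title, href) pairs, one per visible result heading."""
    results = []
    for title in title_elements:
        text = title.text.strip()
        if not text:
            continue  # headings that are not displayed report empty text
        # "xpath" is the value of Selenium's By.XPATH constant.
        # Assumption: the heading is nested inside the result's <a>.
        anchor = title.find_element("xpath", "./ancestor::a")
        href = anchor.get_attribute("href")
        if href and not is_google_url(href):
            results.append((text, href))
    return results
```

In the script from the question, this would replace both loops: call `collect_results(titles)` and print `text` and then `href` for each pair. If a heading has no enclosing link, `find_element` raises `NoSuchElementException`, so you may want to catch that and skip the entry.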