Selenium: How to use a while loop to click a link if the link exists



I am trying to write a Python program that uses Selenium to click a button and go to the next page if the button is clickable. This is because I am scraping a varying number of web pages.

I tried using a while loop to check the href attribute, but the code neither clicks the button nor returns an error. If I simply write button.click() without the while loop or the conditional check on the href attribute, the program clicks the button correctly.

My code also uses the while-loop condition "variable is not None". Is that the right check? My logic is that if there is an href available to click, the program clicks the button and goes to the next page.

Code:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
import numpy as np
import pandas as pd
PATH = "C:Program Files (x86)chromedriver.exe"
wd = webdriver.Chrome(PATH)
wd.get("https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty")
time.sleep(1)

button = wd.find_element_by_xpath("""//a[@aria-label='Next page']""")
#<a tabindex="0" aria-label="Next page" class="ng-star-inserted" style=""> Next <span class="show-for-sr">page</span></a>
href_data = button.get_attribute('href')
while href_data is not None:
    time.sleep(0.5)
    button.click()
    href_data = button.get_attribute('href')

Would anyone here be willing to help me? I know Selenium requires the user to download a web driver, so I apologize for any difficulty in testing my code.

  • Thank you, ExactPlace441

Loop until all the pages have been clicked through.

wd.get('https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty')
wait = WebDriverWait(wd, 10)
while True:
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@aria-label='Next page']"))).click()
        time.sleep(5)
    except:
        break

Imports

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
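
For reference, a self-contained version of this approach might look like the sketch below. The driver setup lines are my assumption (the answer only shows the loop and the imports), so adjust them to your own environment.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Assumed driver setup; swap in Firefox or a driver path as needed.
wd = webdriver.Chrome()
wd.get('https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty')

wait = WebDriverWait(wd, 10)
while True:
    try:
        # Wait until the "Next page" link is clickable, then click it.
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@aria-label='Next page']"))).click()
        time.sleep(5)  # give the next page time to render before looking for the link again
    except Exception:
        # No clickable "Next page" link within the timeout: we are on the last page.
        break

wd.quit()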

I ran into the same problem, and then I used geckodriver (Selenium Firefox) instead of Chrome. My code ran fine with Selenium Firefox, but the same code did not run with Selenium Chrome. Without the while loop I had no problem clicking the button in the Selenium Chrome browser, but it stopped working once I added the while loop. After switching to geckodriver (Selenium Firefox), my problem was solved. Below is an example of a while loop you can use. It keeps clicking the button until the button disappears or the last page is reached.

i = 1
try:
    while i < 2:
        button_element = driver.find_element_by_xpath("give your button xpath")
        button_element.click()  # the loop keeps clicking until the button's xpath disappears from the web page
except:
    pass  # when the button xpath disappears, the error is ignored and execution jumps to the next section of the code
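
A minimal variant of the same idea, assuming driver is an already-initialized WebDriver, is to catch NoSuchElementException explicitly rather than using a bare except; the XPath string below is still a placeholder for your own button:

from selenium.common.exceptions import NoSuchElementException

try:
    while True:
        # Keep clicking until the button can no longer be found on the page.
        button_element = driver.find_element_by_xpath("give your button xpath")
        button_element.click()
except NoSuchElementException:
    pass  # the button is gone, so the last page has been reached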

Here I have modified your code:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time
import numpy as np
import pandas as pd

driver =  webdriver.Firefox()
driver.maximize_window()
url = "https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty"
driver.get(url)

timeout = 20
# Collect the profile cards from the first page
containers = WebDriverWait(driver, timeout).until(
    EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
for container in containers:
    name = container.find_element_by_css_selector('.header-details h5')  # scraping the name from each profile card
    print(name.text)

i = 1
try:
    while i < 2:  # look for the "next page" button on every page and keep clicking it until the last page is reached
        next_page_button = driver.find_element_by_xpath("//li[@class='pagination-next ng-star-inserted']")
        next_page_button.click()
        # collect data from the second page through the last page
        containers = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
        for container in containers:
            name = container.find_element_by_css_selector('.header-details h5')  # scraping the name from each profile card
            print(name.text)
        time.sleep(3)
except:
    pass  # if a page has no "next page" button, the code ends without raising an error
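
Since pandas is imported above but never used, one possible extension (my assumption, not part of the original answer) is to append each name to a list instead of printing it, then build a DataFrame once the pagination loop has finished:

names = []  # accumulate results here instead of printing them

# Inside each "for container in containers:" loop above, replace
# print(name.text) with:
#     names.append(name.text)

# After the pagination loop ends, turn the list into a DataFrame (pd is already imported above).
df = pd.DataFrame({'name': names})
df.to_csv('ucr_physics_faculty.csv', index=False)  # hypothetical output file name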
