Selenium: How to use a while loop to click a link if the link exists



I am trying to write a Python program that uses Selenium to click a button and go to the next page if the button is clickable. This is because I am scraping a varying number of web pages.

I tried using a while loop to check the href attribute, but the code neither clicks the button nor returns an error. If I simply write button.click() without the while loop or the conditional check on the href attribute, the program clicks the button correctly.

My code also uses the while-loop condition "variable is not None". Is that the right check? My logic is that if there is an href available to click, the program clicks the button and goes to the next page.

Code:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
import numpy as np
import pandas as pd
PATH = "C:Program Files (x86)chromedriver.exe"
wd = webdriver.Chrome(PATH)
wd.get("https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty")
time.sleep(1)

button = wd.find_element_by_xpath("""//a[@aria-label='Next page']""")
#<a tabindex="0" aria-label="Next page" class="ng-star-inserted" style=""> Next <span class="show-for-sr">page</span></a>
href_data = button.get_attribute('href')
while href_data is not None:
    time.sleep(0.5)
    button.click()
    href_data = button.get_attribute('href')

Would anyone here be willing to help me? I know Selenium requires the user to download a web driver, so I apologize for any difficulty in testing my code.

  • Thank you, ExactPlace441

Loop until all the pages have been clicked through.

wd.get('https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty')
wait = WebDriverWait(wd, 10)
while True:
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@aria-label='Next page']"))).click()
        time.sleep(5)
    except:
        break

Imports

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
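
For reference, a self-contained version of this approach might look like the sketch below. The driver setup lines are my assumption (the answer only shows the loop and the imports), so adjust them to your own environment.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Assumed driver setup; swap in Firefox or a driver path as needed.
wd = webdriver.Chrome()
wd.get('https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty')

wait = WebDriverWait(wd, 10)
while True:
    try:
        # Wait until the "Next page" link is clickable, then click it.
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@aria-label='Next page']"))).click()
        time.sleep(5)  # give the next page time to render before looking for the link again
    except Exception:
        # No clickable "Next page" link within the timeout: we are on the last page.
        break

wd.quit()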

I ran into the same problem, and then I used geckodriver (Selenium Firefox) instead of Chrome. My code ran fine with Selenium Firefox, but the same code did not run with Selenium Chrome. Without the while loop I had no problem clicking the button in the Selenium Chrome browser, but it stopped working once I added the while loop. After switching to geckodriver (Selenium Firefox), my problem was solved. Below is an example of a while loop you can use. It keeps clicking the button until the button disappears or the last page is reached.

i = 1
try:
    while i < 2:
        button_element = driver.find_element_by_xpath("give your button xpath")
        button_element.click()  # the loop keeps clicking until the button's xpath disappears from the web page
except:
    pass  # when the button xpath disappears, the error is ignored and execution jumps to the next section of the code
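
A minimal variant of the same idea, assuming driver is an already-initialized WebDriver, is to catch NoSuchElementException explicitly rather than using a bare except; the XPath string below is still a placeholder for your own button:

from selenium.common.exceptions import NoSuchElementException

try:
    while True:
        # Keep clicking until the button can no longer be found on the page.
        button_element = driver.find_element_by_xpath("give your button xpath")
        button_element.click()
except NoSuchElementException:
    pass  # the button is gone, so the last page has been reached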

Here I have modified your code:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time
import numpy as np
import pandas as pd

driver =  webdriver.Firefox()
driver.maximize_window()
url = "https://profiles.ucr.edu/app/home/search;name=;org=Physics%20and%20Astronomy;title=;phone=;affiliation=Faculty"
driver.get(url)

timeout = 20
# Collect the profile cards from the first page
containers = WebDriverWait(driver, timeout).until(
    EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
for container in containers:
    name = container.find_element_by_css_selector('.header-details h5')  # scraping the name from each profile card
    print(name.text)

i = 1
try:
    while i < 2:  # look for the "next page" button on every page and keep clicking it until the last page is reached
        next_page_button = driver.find_element_by_xpath("//li[@class='pagination-next ng-star-inserted']")
        next_page_button.click()
        # collect data from the second page through the last page
        containers = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="column ng-star-inserted"]')))
        for container in containers:
            name = container.find_element_by_css_selector('.header-details h5')  # scraping the name from each profile card
            print(name.text)
        time.sleep(3)
except:
    pass  # if a page has no "next page" button, the code ends without raising an error
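
Since pandas is imported above but never used, one possible extension (my assumption, not part of the original answer) is to append each name to a list instead of printing it, then build a DataFrame once the pagination loop has finished:

names = []  # accumulate results here instead of printing them

# Inside each "for container in containers:" loop above, replace
# print(name.text) with:
#     names.append(name.text)

# After the pagination loop ends, turn the list into a DataFrame (pd is already imported above).
df = pd.DataFrame({'name': names})
df.to_csv('ucr_physics_faculty.csv', index=False)  # hypothetical output file name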
