Clicking an element with x-onclick using Python Selenium does not redirect



I am trying to scrape the URL given below with Python Selenium.

https://www.rtilinks.com/?5b5483ba2d=OUhWbXlXOGY4cEE0VEtsK1pWSU5CdEJob0hiR0xFNjN2M252ZXlOWnp0RC9yaFpvN3ZNeW9SazlONWJSTWpvNGNpR0FwWUZwQWduaXdFY202bkcrUHAybkVDc0hMMk9EWFdweitsS0xHa0U9

Here is my code:

from pprint import pprint
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from PIL import Image
import requests
from time import sleep
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
wd = webdriver.Chrome('chromedriver',options=chrome_options)
wd.get("https://www.rtilinks.com/?5b5483ba2d=OUhWbXlXOGY4cEE0VEtsK1pWSU5CdEJob0hiR0xFNjN2M252ZXlOWnp0RC9yaFpvN3ZNeW9SazlONWJSTWpvNGNpR0FwWUZwQWduaXdFY202bkcrUHAybkVDc0hMMk9EWFdweitsS0xHa0U9")
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.ID, "soralink-human-verif-main"))).click()
sleep(10)
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, "//img[@id='showlink' and @x-onclick]"))).click()

After running this code I should be redirected to https://rareapk.com/finance/?n1p0ei2ng5yd3gz, but it stays on the same page.

The element I am clicking is given below.

<img class="spoint" id="showlink" x-onclick="changeLink()" src="https://eductin.com/wp-content/uploads/2021/06/Download.png">


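What my code does

  • First it goes to this URL
  • Then it clicks I'M NOT A ROBOT
  • After the next page loads, Selenium waits 10 seconds
  • Then it clicks an image (with the text DOWNLOAD RTI), which should redirect to the REDIRECTED URL

But at the last step it stays on the same URL; it does not redirect.

I tried the following: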

  • WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, "//img[@id='showlink' and @x-onclick]"))).click()
  • wd.find_element(By.ID, "showlink").click()

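I tested the code without headless and saw the browser open the expected page, but wd.current_url still shows the old URL (and wd.title also shows the old title).

The whole problem is probably that the page opens the new URL in a new tab, so you need wd.switch_to.window(...) to access the other tab.

This code uses wd.switch_to.window(...) and displays the correct URL (and title) in the other tab.

BTW: I had to add a click on "Consent" because my browser sometimes displays that dialog.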

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
from webdriver_manager.chrome import ChromeDriverManager, ChromeType
#from webdriver_manager.firefox import GeckoDriverManager
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
#wd = webdriver.Chrome('chromedriver', options=chrome_options)
wd = webdriver.Chrome(service=Service(ChromeDriverManager(chrome_type=ChromeType.CHROMIUM).install()), options=chrome_options)
#wd = webdriver.Firefox(service=Service(GeckoDriverManager().install()))
wd.get("https://www.rtilinks.com/?5b5483ba2d=OUhWbXlXOGY4cEE0VEtsK1pWSU5CdEJob0hiR0xFNjN2M252ZXlOWnp0RC9yaFpvN3ZNeW9SazlONWJSTWpvNGNpR0FwWUZwQWduaXdFY202bkcrUHAybkVDc0hMMk9EWFdweitsS0xHa0U9")
p = wd.current_window_handle
print('current_window_handle:', p)
try:
    print('Waiting for: "Consent"')
    WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@aria-label='Consent']"))).click()
except Exception as ex:
    # the consent dialog does not always appear, so a timeout here is not fatal
    print('Exception:', ex)

print('Waiting for: "I\'m not a robot"')
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.ID, "soralink-human-verif-main"))).click()
print('Waiting for: "Download (RTI)"')
WebDriverWait(wd, 20).until(EC.element_to_be_clickable((By.XPATH, "//img[@id='showlink' and @x-onclick]"))).click()
print('--- active tab ---')
print('current_window_handle:', p)
print('current_url:', wd.current_url)
print('title:', wd.title)
print('--- other tabs ---')
chwd = wd.window_handles
for w in chwd:
    # switch focus to child window
    if w != p:
        wd.switch_to.window(w)
        print('current_window_handle:', w)
        print('current_url:', wd.current_url)
        print('title:', wd.title)
        print('---')

wd.close()        

Result:

Waiting for: "Consent"
Waiting for: "I'm not a robot"
Waiting for: "Download (RTI)"
--- active tab ---
current_window_handle: CDwindow-31FDEC2C62AA0666A8F3A1DD2133D02C
current_url: https://eductin.com/how-to-fix-and-restore-deleted-mac-system-files/
title: How to fix and Restore deleted Mac system files. – Eductin
--- other tabs ---
current_window_handle: CDwindow-CB1EAE5B6DCD4ACF5D061ED4ECC314CD
current_url: https://sakarnewz.com/
title: SakarNewz – BOOST YOUR KNOWLEDGE WITH TECH NEWS AND UPDATES
---
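
In the run above the new tab was already listed in wd.window_handles right after the click, but that may not always be the case. A minimal sketch of polling for the new handle before switching is shown here. The helper name wait_for_new_handles and its parameters known_handles, timeout and poll are hypothetical; the only driver attribute it relies on is window_handles, which the code above already uses.

```python
import time


def wait_for_new_handles(driver, known_handles, timeout=10.0, poll=0.5):
    """Poll driver.window_handles until a handle outside known_handles appears.

    Returns the list of new handles, or an empty list if the timeout expires.
    (Hypothetical helper; not part of Selenium.)
    """
    known = set(known_handles)
    deadline = time.monotonic() + timeout
    while True:
        # Handles that were not present before the click
        new = [h for h in driver.window_handles if h not in known]
        if new or time.monotonic() >= deadline:
            return new
        time.sleep(poll)


# Possible usage with the script above (assumes `wd` is the driver):
#   handles_before = wd.window_handles
#   ... click the "Download (RTI)" image ...
#   for h in wait_for_new_handles(wd, handles_before):
#       wd.switch_to.window(h)
#       print(wd.current_url, wd.title)
```

Selenium's expected_conditions module also offers number_of_windows_to_be and new_window_is_opened, which could be used with WebDriverWait instead of a hand-written loop.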

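EDIT:

Sometimes this code has problems displaying information about the other tab, because the tab seems to keep running JavaScript and Selenium may not be able to access the data.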
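
One way to keep the script going when a tab cannot be read is to wrap the per-tab access in try/except. The sketch below assumes the failure surfaces as a Python exception (for example a timeout), which I have not verified for this site. The helper name describe_tabs is hypothetical; it uses only window_handles, switch_to.window, current_url and title, as in the code above.

```python
def describe_tabs(driver, main_handle):
    """Collect (handle, url, title) for every tab except main_handle.

    A tab that raises while being inspected is recorded with url and title
    set to None instead of aborting the loop. (Hypothetical helper.)
    """
    results = []
    for handle in driver.window_handles:
        if handle == main_handle:
            continue
        try:
            driver.switch_to.window(handle)
            results.append((handle, driver.current_url, driver.title))
        except Exception as ex:  # broad on purpose: this is only a sketch
            print('Could not read tab', handle, '-', ex)
            results.append((handle, None, None))
    # Return focus to the original tab so later calls act on it
    driver.switch_to.window(main_handle)
    return results


# Possible usage with the script above (assumes `wd` and `p` as defined there):
#   for handle, url, title in describe_tabs(wd, p):
#       print(handle, url, title)
```

If the tab blocks for a long time rather than failing, calling wd.set_page_load_timeout(...) beforehand may help the access fail faster, though that is a guess about this particular page.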
