Getting a video link from a website with Selenium: how do I get the link?



I want to get the video link from the page https://www.ofw.su/family-feud-july-29-2022 but I can't. Here is my code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import time
from datetime import datetime
from random import randint
import random
import string
import os

def get(link):
    CHROMEDRIVER_PATH = 'chromedriver.exe'
    options = webdriver.ChromeOptions()
    options.add_argument("user-data-dir=E:\profile")
    options.add_argument("--disable-notifications")
    #options.add_argument("--headless")
    options.add_experimental_option('excludeSwitches', ['enable-logging'])
    driver = webdriver.Chrome(executable_path=CHROMEDRIVER_PATH,options=options)
    driver.get(link)
    time.sleep(2)
    url_video = driver.find_element_by_xpath("/html/body/div/div[2]/div[3]/video").get_attribute('src')
    print(url_video)
    return url_video

link = "https://www.ofw.su/family-feud-july-29-2022"
get(link)

I don't get any link.

The element you are trying to access is inside an iframe.
To access elements inside an iframe, you first have to switch to that iframe, like this:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import time
from datetime import datetime
from random import randint
import random
import string
import os

def get(link):
    CHROMEDRIVER_PATH = 'chromedriver.exe'
    options = webdriver.ChromeOptions()
    options.add_argument("user-data-dir=E:\profile")
    options.add_argument("--disable-notifications")
    #options.add_argument("--headless")
    options.add_experimental_option('excludeSwitches', ['enable-logging'])
    driver = webdriver.Chrome(executable_path=CHROMEDRIVER_PATH,options=options)
    driver.get(link)
    time.sleep(2)
    # Switch into the iframe that hosts the video player
    iframe = driver.find_element_by_xpath("//iframe[@class='embed-responsive-item']")
    driver.switch_to.frame(iframe)
    url_video = driver.find_element_by_xpath("/html/body/div/div[2]/div[3]/video").get_attribute('src')
    print(url_video)
    return url_video

link = "https://www.ofw.su/family-feud-july-29-2022"
get(link)

When you have finished working with the elements inside the iframe, switch back to the main page content with:

driver.switch_to.default_content()

Also, you should use explicit waits rather than the hard-coded delay time.sleep(2), and relative locators rather than absolute XPaths like /html/body/div/div[2]/div[3]/video.
