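I have a list_of_links containing three videos: two from creators with fewer than 500 subscribers and one from a creator with a large number of subscribers. My task is to collect only the creators who have fewer than 500 subscribers. I tried the code below, but it raises an error. Here is the traceback: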
NoSuchElementException                    Traceback (most recent call last)
~AppDataLocalTempipykernel_182643954802536.py in <module>
      5     driver.get(link)
      6     x_path_for_followers = '//*[@id="owner-sub-count"]'
----> 7     followers = driver.find_element(By.XPATH, value=x_path_for_followers)
      8     nFollowers = getNumberOfFolowers(followers)  # a function that parses the text here into a number
      9     if nFollowers > 500:
This is what I tried:
list_of_links = ['https://www.youtube.com/watch?v=rXUW-BfPNx8', 'https://www.youtube.com/watch?v=b9s_xrycvGA', 'https://www.youtube.com/watch?v=k9TUPpGqYTo']
creators = set()
for link in list_of_links:
    driver.get(link)
    x_path_for_followers = '//*[@id="owner-sub-count"]'
    followers = driver.find_element(By.XPATH, value=x_path_for_followers)
    nFollowers = getNumberOfFolowers(followers)  # a function that parses the text here into a number
    if nFollowers > 500:
        continue
    x_path_for_creator = '//*[@id="text"]/a'
    creator = driver.find_element(By.XPATH, value=x_path_for_creator)
    href = creator.get_attribute("href")
    creators.add(href)  # creators is a set, so use add() rather than append()
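I want to get these two creators (the creators of the first two videos, who have fewer than 500 subscribers): @translationalbiology5464, @imjustslightlybetter

As the error states, Selenium could not find an element matching the given XPath.
You can do it like this instead: wait for the owner block to become visible, then read the subscriber count from its text.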
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
driver = webdriver.Chrome(options=options)


def getNumberOfFolowers(text):
    # Convert a label such as "1.2K subscribers" into an integer.
    text = text.replace(' subscribers', '')
    if 'K' in text:
        num = float(text.replace('K', '')) * 1000
    elif 'M' in text:
        num = float(text.replace('M', '')) * 1000000
    else:
        num = float(text)
    # round() avoids truncating float results such as 1100.0000000000002
    return round(num)


list_of_links = ['https://www.youtube.com/watch?v=iBk60Pv_BMo', 'https://www.youtube.com/watch?v=Cnwt5emWaOg', 'https://www.youtube.com/watch?v=OBXtdBnEUvo', 'https://www.youtube.com/watch?v=HZ8uXq5VG2w', 'https://www.youtube.com/watch?v=03c8M6LZR_k']
creators = set()

for link in list_of_links:
    driver.get(link)
    # Wait until the channel owner block is visible before reading it.
    WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'div[id="owner"]')))
    # The second line of the block's text holds the subscriber count.
    followers = driver.find_element(By.CSS_SELECTOR, 'div[id="owner"]').text.split('\n')[1]
    nFollowers = getNumberOfFolowers(followers)
    if nFollowers > 500:
        continue  # skip channels with more than 500 subscribers
    creator = driver.find_element(By.XPATH, '//*[@id="text"]/a').get_attribute('href')
    # Keep only the handle, e.g. '@name', by stripping the site prefix.
    creator_id = creator.replace('https://www.youtube.com/', '')
    creators.add(creator_id)

print(creators)
Resulting creators:
{'@ajeet214_', '@DeltaFilmsStudio'}
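
If you want to check the parsing and filtering logic without opening a browser, something like the following might help. It is only a sketch under a few assumptions: the labels are in English ("431 subscribers", "1.2K subscribers"), and the names parse_subscriber_count, channel_handle, small_creators and the sample data are made up for illustration, not part of Selenium or YouTube.

```python
import re
from urllib.parse import urlparse

# Suffix multipliers assumed for English-language labels.
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_subscriber_count(text):
    """Convert a label such as '1.2K subscribers' into an integer."""
    match = re.match(r"\s*([\d.,]+)\s*([KMB]?)\b", text)
    if match is None:
        raise ValueError(f"unrecognised subscriber label: {text!r}")
    number = float(match.group(1).replace(",", ""))
    # round() guards against float artefacts such as 1100.0000000000002
    return round(number * _MULTIPLIERS[match.group(2)])


def channel_handle(href):
    """Return the path part of a channel URL, e.g. '@name'."""
    return urlparse(href).path.strip("/")


def small_creators(scraped, limit=500):
    """Keep handles whose subscriber count does not exceed the limit."""
    return {
        channel_handle(href)
        for href, label in scraped
        if parse_subscriber_count(label) <= limit
    }


# Hypothetical (href, label) pairs, as they might be scraped per video.
sample = [
    ("https://www.youtube.com/@small_channel", "431 subscribers"),
    ("https://www.youtube.com/@big_channel", "1.2M subscribers"),
]
print(small_creators(sample))
```

Like the loop above, this keeps channels with exactly 500 subscribers; change `<=` to `<` if you need strictly fewer than 500.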