YouTube web scraping: sorting comments by newest



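I am trying to scrape the comment section of a YouTube video. Before scraping, I want to sort the comments by newest, so I have to click "Sort by" and then "Newest first". Unfortunately, I have had no luck so far. Thanks for your help.

Screen recording: https://i.stack.imgur.com/efZVl.jpg

Code: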

import sys, unittest, time, datetime
import urllib.request, urllib.error, urllib.parse
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import InvalidArgumentException
from selenium.webdriver import Chrome
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.select import Select
from selenium.webdriver import ActionChains
from bs4 import BeautifulSoup
import requests
import requests.exceptions
from urllib.parse import urlsplit
from collections import deque
import re
import os
import shutil
import smtplib
import string
import pyautogui
options = webdriver.ChromeOptions()
options.add_argument('--lang=en')
options.add_argument("--start-maximized")
driver = webdriver.Chrome(executable_path=r'C:\Users\caspe\OneDrive\Documents\Övrigt\Kodning\Email\chromeDriver\chromedriver.exe', chrome_options=options)
driver.get("https://www.youtube.com/watch?v=EV6PLN_8RBw")
time.sleep(5)
screenWidth, screenHeight = pyautogui.size()
currentMouseX, currentMouseY = pyautogui.position()
pyautogui.moveTo(1050, 780) # Move the mouse to XY coordinates.
pyautogui.click()
print("clicked")
time.sleep(1)
pyautogui.moveTo(1250, 880) # Move the mouse to XY coordinates.
pyautogui.click() 
print("clicked")
time.sleep(1)
pyautogui.moveTo(700, 500) # Move the mouse to XY coordinates.
pyautogui.click() 
print("clicked")
time.sleep(1)

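I strongly recommend replacing the pyautogui parts with element = driver.find_element_by_xpath(...) (or a lookup by CSS selector) followed by element.click(). First click the button that opens the dropdown, then click the "Newest first" option. I did this in another script without any problems. I can't format this properly because I'm on Android, though.

tl;dr: replace pyautogui with Selenium. First click the button that makes the options selectable, then click the option you want to sort by.

Edit: OK, I tried it myself. pyautogui didn't work for me. I'm not sure whether that is because I use a different driver or something else, but if it works for you, that's fine. The actual problem is that you need to scroll down a bit first so that the comments get loaded. You can use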

for i in range(5):
    # Scroll the page itself; no element has to be located for this step.
    driver.execute_script("window.scrollBy(0, 500)")
    time.sleep(2)

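I'm not sure whether 5 is enough or too many, since I can't check it myself, but that is basically it. After that, you can locate the button that opens the dropdown with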

element = driver.find_element_by_xpath('//*[@id="label"]')
element.click()
# Now select the option from the dropdown.
element = driver.find_element_by_xpath("/html/body/ytd-app/div/ytd-page-manager/ytd-watch-flexy/div[4]/div[1]/div/ytd-comments/ytd-item-section-renderer/div[1]/ytd-comments-header-renderer/div[1]/span/yt-sort-filter-sub-menu-renderer/yt-dropdown-menu/paper-menu-button/iron-dropdown/div/div/paper-listbox/a[2]/paper-item")
element.click()
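
Since the fixed count of five scrolls is only a guess, one alternative is to keep scrolling until the sort button actually shows up. The following is a minimal sketch of that idea, not a tested solution for YouTube: scroll_until_found and its parameters are hypothetical names introduced here, and the locator is the XPath from this answer written as a (strategy, value) tuple.

```python
import time

# Locator taken from the answer; Selenium's By.XPATH is the plain string "xpath".
SORT_BUTTON = ("xpath", '//*[@id="label"]')


def scroll_until_found(driver, locator, max_scrolls=10, step=500, pause=2.0):
    """Scroll down in steps until `locator` matches; return the element or None.

    Hypothetical helper: it only needs a driver object that offers
    find_elements() and execute_script().
    """
    for _ in range(max_scrolls):
        matches = driver.find_elements(*locator)
        if matches:
            return matches[0]
        # Scroll the window by `step` pixels and give the page time to load.
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)
    # One last look after the final scroll.
    matches = driver.find_elements(*locator)
    return matches[0] if matches else None
```

With a running driver, the call would be button = scroll_until_found(driver, SORT_BUTTON), followed by button.click() if the result is not None. The upper bound max_scrolls keeps the loop from running forever when the comments never load.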

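You can also use explicit waits to make sure the element is in view before clicking, but I'm not comfortable using them yet. They are described in the Selenium documentation.
Give it a try and let me know whether it works :D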
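
For the explicit-wait route mentioned above, a rough sketch could look like the following. It relies on WebDriverWait and expected_conditions, which the question's script already imports; click_when_clickable is a hypothetical helper name, the 10-second timeout is an arbitrary assumption, and none of this has been checked against the live YouTube page.

```python
# Locator copied from the answer; Selenium's By.XPATH is the plain string "xpath".
SORT_BUTTON = ("xpath", '//*[@id="label"]')


def click_when_clickable(driver, locator, timeout=10):
    """Wait up to `timeout` seconds until the element is clickable, then click it.

    Hypothetical helper; WebDriverWait raises TimeoutException if the
    element does not become clickable in time.
    """
    # Imported inside the function so the sketch can be loaded without Selenium.
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    element = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(locator)
    )
    element.click()
    return element
```

The same helper could be called twice: first with SORT_BUTTON to open the dropdown, then with a second ("xpath", ...) tuple holding the long XPath of the "Newest first" item from the code above.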
