How to scroll to the end of a page with a finite number of loads? Selenium - Python



I want to scroll to the end of a page, for example: https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true

The problem is that this:

import time

# Get scroll height after first page load
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait for the page to load
    time.sleep(2)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

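does not work. Yes, it is meant for pages with infinite loading, and Yahoo Finance has a finite number of loads, but the condition should still break when the end is reached. So I'm quite confused at the moment.

We can also use: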

while driver.find_element_by_tag_name('tfoot'):
    # Scroll down three times to load the table
    for i in range(0, 3):
        driver.execute_script("window.scrollBy(0, 5000)")
        time.sleep(2)

But it sometimes hangs on certain loads.

What is the best way to do this?

This requires pip install undetected-chromedriver, but it will get the job done. That is just my choice of webdriver; you can do exactly the same with normal Selenium.

from time import sleep as s
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
import undetected_chromedriver as uc

options = uc.ChromeOptions()
options.headless = False
driver = uc.Chrome(options=options)
driver.get('https://fr.finance.yahoo.com/quote/GM/history?period1=1290038400&period2=1612742400&interval=1d&filter=history&frequency=1d&includeAdjustedClose=true')
# Click the button on the cookie consent page
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#consent-page > div > div > div > div.wizard-body > div.actions.couple > form > button'))).click()

# None never equals a real scroll position, so the first pass cannot stop early
last_scroll_pos = None
while True:
    # Press the down arrow key on the page body to scroll a little
    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, 'body'))).send_keys(Keys.DOWN)
    s(.01)
    current_scroll_pos = driver.execute_script('return window.pageYOffset;')
    # Stop once the scroll position no longer changes
    if current_scroll_pos == last_scroll_pos:
        print('scrolling is finished')
        break
    last_scroll_pos = current_scroll_pos
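
If a single "position did not change" check turns out to stop too early (for example because the next batch of rows has not loaded yet, which I have not verified for Yahoo Finance), one option is to require several unchanged checks in a row before giving up. Below is a rough sketch of that idea. The helper name scroll_until_stable and its parameters (pause, stable_rounds, max_steps) are made up for illustration and are not part of Selenium. It takes two callables so it does not depend on any particular driver.

```python
import time
from typing import Callable


def scroll_until_stable(
    get_position: Callable[[], float],
    scroll_step: Callable[[], None],
    pause: float = 0.5,
    stable_rounds: int = 3,
    max_steps: int = 10_000,
) -> int:
    """Call scroll_step until get_position stops changing.

    Hypothetical helper: stops after `stable_rounds` consecutive checks
    with no change, or after `max_steps` scroll steps, whichever is first.
    Returns the number of scroll steps performed.
    """
    last = get_position()
    unchanged = 0
    steps = 0
    while steps < max_steps:
        scroll_step()
        steps += 1
        time.sleep(pause)  # give the page a moment to load more content
        current = get_position()
        if current == last:
            unchanged += 1
            if unchanged >= stable_rounds:
                break
        else:
            # The position moved, so reset the counter and keep going
            unchanged = 0
            last = current
    return steps


# Possible usage with an existing Selenium `driver` (assumed, not created here):
# scroll_until_stable(
#     get_position=lambda: driver.execute_script("return window.pageYOffset;"),
#     scroll_step=lambda: driver.execute_script("window.scrollBy(0, 1000);"),
# )
```

The max_steps cap is only a safety net so the loop cannot run forever on a page that really does load without end; the values for pause and stable_rounds are guesses and would need tuning for the page in question.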
