Navigating table data through a dropdown menu with Pandas, Beautiful Soup, or Selenium



I am trying to scrape data from this website:

https://www.shanghairanking.com/rankings/grsssd/2021

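Initially pandas gets me out of the gate and I can scrape the table, but I am struggling with the dropdown menu. I want to select the options next to the Total Score box, i.e. PUB, CIT, etc. When I inspect the element it looks like it may be JavaScript, and the usual methods for interacting with these options do not work. I have tried Beautiful Soup and, most recently, Selenium to select the dropdown manually. This works for the default table data.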

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome('/Users/martinbell/Downloads/chromedriver')
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
submit = driver.find_element_by_xpath("//input[@value='CIT']").click()

This gets me nowhere.

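Your code will not work because you must first click the dropdown to open it and only then go through the options inside it. Here is the refactored code.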

Note that I am using time.sleep here only as a quick stopgap; for robust code and good practice, use explicit waits such as WebDriverWait.

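from selenium.webdriver.common.by import By  # needed for By.XPATH below

driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
time.sleep(10)  # crude pause so the page can finish rendering
# Click the dropdown first so that its options are present and clickable.
driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# The commented-out code below loops through all the dropdown options and performs actions.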
# opt_ele = driver.find_elements(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li")
# for ele in opt_ele:
#     print(ele.text)
#     ele.click()
#     print('perform your actions here')
#     driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# If you do not want to loop through but just want to select only CIT, here is the line:
driver.find_element(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']").click()
