Help navigating table data with a dropdown menu using Pandas, Beautiful Soup, or Selenium

I am trying to scrape data from this website:

https://www.shanghairanking.com/rankings/grsssd/2021

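Initially, pandas got me out of the gate and I could scrape the table, but I am struggling with the dropdown menu. I want to select the options next to the Total Score box, i.e. PUB, CIT, etc. When I inspect the element, it looks like it may be JavaScript, and the usual methods for interacting with these options do not work. I have tried Beautiful Soup and, most recently, Selenium to select the dropdown manually. This works for the default table data: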

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome('/Users/martinbell/Downloads/chromedriver')
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
submit = driver.find_element_by_xpath("//input[@value='CIT']").click()

I am getting nowhere.

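Your code will not work because you must first click the dropdown to open it, and only then go through the options inside it. Here is the refactored code.

Note that I used time.sleep here as a quick fix, but for robust code and good practice, use explicit waits such as WebDriverWait.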

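# By is needed for the By.XPATH locators below; driver and time come from the question's code.
from selenium.webdriver.common.by import By
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')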
time.sleep(10)
driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# The commented-out code below loops through all the dropdown options and performs an action on each.
# opt_ele = driver.find_elements(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li")
# for ele in opt_ele:
#     print(ele.text)
#     ele.click()
#     print('perform your actions here')
#     driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# If you do not want to loop through but just want to select only CIT, here is the line:
driver.find_element(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']").click()
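
As a follow-up to the note about explicit waits, here is a minimal sketch of the same two clicks using WebDriverWait instead of time.sleep. The XPaths are copied from the code above and have not been re-checked against the live page. The helper names option_xpath and select_option and the 10-second default timeout are assumptions made for illustration, not part of the original answer.

```python
# XPaths taken from the answer's code above (assumed to still match the page).
DROPDOWN_XPATH = "(//*[@class='inputWrapper'])[3]"
OPTIONS_XPATH = "(//*[@class='rank-select'])[2]//*[@class='options']//li"


def option_xpath(label):
    """Build the XPath for one dropdown option, e.g. 'CIT' or 'PUB'."""
    return f"{OPTIONS_XPATH}[text()='{label}']"


def select_option(driver, label, timeout=10):
    """Open the dropdown and click the option whose text equals `label`.

    `driver` is an already-created Selenium WebDriver that has loaded the page.
    """
    # Imported here so option_xpath() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Wait until the dropdown is clickable instead of sleeping for a fixed time.
    wait.until(EC.element_to_be_clickable((By.XPATH, DROPDOWN_XPATH))).click()
    # Wait until the requested option is clickable, then click it.
    wait.until(EC.element_to_be_clickable((By.XPATH, option_xpath(label)))).click()
```

After driver.get(...) has been called, select_option(driver, "CIT") would replace both the time.sleep(10) call and the two find_element clicks. If an element never becomes clickable within the timeout, the wait raises a TimeoutException rather than failing silently.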
