Selenium: Reassignment problem in a for loop



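I tried debugging my program with print statements to see what happens on each iteration.

What works: the program checks all 50 drop-down combinations in total (25 per year).

What doesn't work: for some reason, the totals dictionary only stores the entries from the second iteration of the outer "year" for loop. It ends up as a dictionary of length 25 (only half the length I actually want).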

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
# General Stuff about the website
path = '/Users/admin/desktop/projects/scraper/chromedriver'
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=path)
website = 'http://siops.datasus.gov.br/filtro_rel_ges_covid_municipal.php'
driver.get(website)
# Initial Test: printing the title
print(driver.title)
print()
# Dictionary to Store stuff in
totals = {}
### Drop Down Menus ###
state_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]'))
state_options = state_select.options
year_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]'))
year_options = year_select.options
# county_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbMunicipio"]'))
# county_select.select_by_value('120025')
# report_select = Select(driver.find_element(By.XPATH, '//*[@id="gesRelatorio"]'))
# report_select.select_by_value('rel_ges_covid_rep_uniao_municipal.php')
# period_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbPeriodo"]'))
# period_select.select_by_value('14')
### Loop through all combinations ###
for year in range(1, 3):
    year_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbAno"]'))
    year_select.select_by_index(year)
    for index in range(0, len(state_options) - 1):
        state_select = Select(driver.find_element(By.XPATH, '//*[@id="cmbUF"]'))
        state_select.select_by_index(index)
        # Click the Submit Button
        submit_button = driver.find_element(By.XPATH, '//*[@id="container"]/div[2]/form/div[2]/div/input[2]')
        submit_button.click()
        # Pulling data from the webpage
        nameof = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[1]/tbody/tr[2]').text
        total_balance = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[3]/tbody/tr[9]/td[2]').text
        paid_expenses = driver.find_element(By.XPATH, '//*[@id="arearelatorio"]/div[1]/div/table[4]/tbody/tr[11]/td[4]').text
        # Update Dictionary with the new info
        totals.update({nameof: [total_balance, paid_expenses, year]})

        print([nameof, year])
        driver.back()
# Print the final Dictionary and quit
print(len(totals))
print(totals)
driver.quit()

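@Alex Karamfilov spotted this in his comment:

"Just a wild guess, but is it possible that the values for the same key in the dictionary are being overwritten? Since this is a dictionary and keys must be unique, that could be the reason only the second iteration's values are there."

This was a silly mistake on my part. The key (nameof) is the same in both year iterations, so the second pass just modifies the existing value instead of creating a new key-value pair.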
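For anyone hitting the same thing, here is a minimal sketch of the overwrite and one possible way around it, without Selenium. The state names and figures below are made-up stand-ins for the scraped values, and keying on a (name, year) tuple is just one option I am assuming would suit this case; a nested dictionary or a list of rows would work as well.

```python
# Hypothetical stand-in rows: (nameof, total_balance, paid_expenses, year).
# The same names appear once per year, as on the scraped report pages.
rows = [
    ("Acre", "100", "90", 1),
    ("Bahia", "200", "180", 1),
    ("Acre", "110", "95", 2),
    ("Bahia", "210", "185", 2),
]

# Keyed by name only: the year-2 rows overwrite the year-1 rows.
by_name = {}
for nameof, total_balance, paid_expenses, year in rows:
    by_name.update({nameof: [total_balance, paid_expenses, year]})

# Keyed by (name, year): every combination gets its own entry.
by_name_and_year = {}
for nameof, total_balance, paid_expenses, year in rows:
    by_name_and_year[(nameof, year)] = [total_balance, paid_expenses]

print(len(by_name))           # 2
print(len(by_name_and_year))  # 4
```

In the original loop, the equivalent change would be replacing the totals.update(...) line with something like totals[(nameof, year)] = [total_balance, paid_expenses].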
