StaleElementReferenceException when looping over pages

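Setup

I am using Python + Selenium to scrape company information from this website.

Because the site does not let me simply load a page URL, I plan to click the next-page arrow element at the bottom of the list, using a while loop with a counter.

Code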

# el_css / el_cn are the asker's own helpers (not shown); presumably they
# look up a single element by CSS selector / by class name.
browser.get('https://new.abb.com/channel-partners/search#')
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'abb-pagination')))
# start while loop and counter
c = 1
while c < 65:
    c += 1
    # obtain list of companies element (a CSS selector needs By.CSS_SELECTOR, not By.CLASS_NAME)
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '#PublicWrapper > main > section:nth-child(7) > div:nth-child(2)')))
    resultlist = el_css('#PublicWrapper > main > section:nth-child(7) > div:nth-child(2)')
    # loop over companies in list
    for company in resultlist.find_elements_by_xpath('div'):
        # company name
        name = company.find_element_by_xpath('h3/a/span').text
        # code to capture more company info follows
    # next page arrow element
    next_page_arrow = el_cn('abb-pagination__item--next')
    next_page_arrow.click()

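Problem

Without the while loop, the code captures the company information correctly, but only for the first page.

However, once I add the while loop to iterate over the pages, I get the following error: StaleElementReferenceException: stale element reference: element is not attached to the page document (Session info: chrome=88.0.4324.192)

Looking more closely, the resultlist for the subsequent page does seem to be captured, but the loop over the companies in resultlist raises this error.

What should I do?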

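The simplest fix is to pause briefly on each page before reading it (a fixed time.sleep, alongside the explicit waits):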

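import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)  # the 10 s timeout is an assumption

driver.get('https://new.abb.com/channel-partners/search#')
company_name = []
while True:
    time.sleep(1)  # give the newly requested page time to render
    company_name += [elem.text for elem in wait.until(EC.presence_of_all_elements_located((By.XPATH, '//span[@property="name"]')))]
    # if the next-page arrow still links to a page, click it; otherwise leave the loop
    if driver.find_elements(By.XPATH, '//li[@class="abb-pagination__item--next"]/a[contains(@href,"#page")]'):
        wait.until(EC.presence_of_element_located((By.XPATH, '//li[@class="abb-pagination__item--next"]/a'))).click()
    else:
        break
len(company_name)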

Output:

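You don't need a counter. Instead, you can check whether the arrow still links to a next page; that way your logic keeps working if pages 65, 66, [...] are added.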

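The underlying problem is that the while loop runs faster than the page updates: after the click, the loop reads the list before the new page has rendered, and the elements it found become detached from the DOM once the list is replaced, which is what raises StaleElementReferenceException. Instead of a fixed pause, you can also save the list of company names from the current page, click the next arrow, and compare it with the newly read list; as long as the two are identical, keep waiting until the new list differs from the previous one.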
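
The compare-and-wait idea can be sketched as a small polling helper. This is only an illustration, not part of the original answer: the name wait_for_new_list, its parameters, and the default timeout are all hypothetical, and the demo uses a stand-in reader instead of a real browser so it can run on its own.

```python
import time
from typing import Callable, List, Sequence, Tuple, Type


def wait_for_new_list(
    read_names: Callable[[], List[str]],
    previous: Sequence[str],
    timeout: float = 10.0,   # assumed default, tune for the site
    poll: float = 0.25,
    ignored: Tuple[Type[BaseException], ...] = (),
) -> List[str]:
    """Poll read_names() until it returns a non-empty list that differs from previous."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            current = list(read_names())
        except ignored:
            current = []  # the page was mid-update; treat as "not ready yet"
        if current and current != list(previous):
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("list did not change within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in for a page whose list only changes on the third read.
reads = iter([["A", "B"], ["A", "B"], ["C", "D"]])
page_two = wait_for_new_list(lambda: next(reads), previous=["A", "B"], poll=0.01)
print(page_two)  # ['C', 'D']
```

With Selenium, read_names would be a function that returns [e.text for e in driver.find_elements(By.XPATH, '//span[@property="name"]')], called right after clicking the arrow, and ignored could be (StaleElementReferenceException,) from selenium.common.exceptions so that a read taken in the middle of a re-render is simply retried. One caveat of this approach: if two consecutive pages happened to show identical names, the helper would time out rather than advance.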
