Iterating over li tags with Selenium, but not getting .text from the first two elements of the list



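I'm trying to scrape https://game-rainbow6.ubi.com/en-us/uplay/player-statistics/dbd1cef3-d69d-4296-a235-ae8d7d70363f/multiplayer, specifically the Operators tab (sorry, selecting the tab doesn't change the link). I can get to the tab easily, but when I grab the li tags and iterate over them to get the four web elements I need (name, time played, K/D, W/L), the first two operators in the list are skipped. The rest print fine. I tried an implicit wait, just to see whether the first two weren't loading as quickly, but that didn't work. Then I tried the explicit wait that's in the code now, but it times out every time. I also tried finding the elements by XPath. This is the full XPath of the first operator's name: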

//*[@id="section"]/div/div/div[2]/div/div[1]/div/div/div/div/article[3]/div[1]/div/div/div/nav/ul/li[1]/div/div[1]/div[1]/div/div[1]/p

I tried using

.//div/div[1]/div[1]/div/div[1]/p 

inside the for loop, since I only need the tail of the path for each element, but it still skips the first two operators.
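For reference, a minimal sketch of what that relative lookup could look like inside the loop. The function name `operator_names` and the constant `NAME_XPATH` are hypothetical; the XPath is the one quoted above, and `find_element_by_xpath` is the same Selenium 3 style call used in the code below.

```python
# Relative XPath quoted above; the leading '.' makes it relative to each <li>.
NAME_XPATH = './/div/div[1]/div[1]/div/div[1]/p'


def operator_names(operators):
    """Return the .text of the name <p> found under each <li> WebElement."""
    names = []
    for operator in operators:
        # Search starts at this <li>, not at the document root.
        name_element = operator.find_element_by_xpath(NAME_XPATH)
        names.append(name_element.text)
    return names
```

Without the leading `.`, an XPath beginning with `//` is evaluated from the document root even when it is called on a child element, so every iteration would return the same first match.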

I created a test login so people can view the HTML properly:

email = UbiTest1337@gmail.com
pwd = Password1

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def scrapeOperatorStats(self):
    # Navigate to the operator tab
    operator_tab = self.driver.find_element_by_xpath('//*[@id="section"]/div/div/div[2]/div/div[1]/div/div/div/div/article[1]/div[2]/div/div[1]/button')
    self.driver.execute_script("arguments[0].click();", operator_tab)
    # Wait for the first operator's name to become visible
    WebDriverWait(self.driver, 10).until(EC.visibility_of_element_located((By.XPATH, '//*[@id="section"]/div/div/div[2]/div/div[1]/div/div/div/div/article[3]/div[1]/div/div/div/nav/ul/li[1]/div/div[1]/div[1]/div/div[1]/p')))
    # Get the ul that holds one li per operator with their respective stats
    operator_list_set = self.driver.find_element_by_xpath('//*[@id="section"]/div/div/div[2]/div/div[1]/div/div/div/div/article[3]/div[1]/div/div/div/nav/ul')
    operators = operator_list_set.find_elements_by_tag_name('li')
    for operator in operators:
        # Each li holds the p tags for name, time played, K/D, and W/L
        operator_stats = operator.find_elements_by_tag_name('p')
        for stat in operator_stats:
            print(stat.text)

I found that you can use get_attribute('innerHTML'), which gets all of the elements, and you don't even need to switch tabs.

# Get the ul that holds one li per operator with their respective stats
operator_list_set = self.driver.find_element_by_xpath('//*[@id="section"]/div/div/div[2]/div/div[1]/div/div/div/div/article[3]/div[1]/div/div/div/nav/ul')
operators = operator_list_set.find_elements_by_tag_name('li')
for operator in operators:
    operator_stats = operator.find_elements_by_tag_name('p')
    for stat in operator_stats:
        # innerHTML is returned even when the element is not displayed
        print(stat.get_attribute('innerHTML'))
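A likely reason this works: Selenium's .text only returns text that is rendered on the page, so an element that exists in the DOM but is hidden yields an empty string, while get_attribute() reads the value regardless of visibility. As a rough sketch building on that, the helper below reads `textContent` (plain text, without any nested markup that `innerHTML` might include) and groups each operator's values into a dict. The names `clean` and `collect_operator_stats` are hypothetical, and the field order (name, time played, K/D, W/L) is an assumption taken from the question, not verified against the page.

```python
def clean(raw):
    """Collapse runs of whitespace and strip the ends of a raw DOM string."""
    return " ".join(raw.split())


def collect_operator_stats(operator_list_set):
    """Return one dict per <li>, reading text even from hidden <p> elements."""
    # Assumed order of the <p> tags inside each <li> (from the question).
    fields = ("name", "time_played", "kd", "wl")
    rows = []
    for operator in operator_list_set.find_elements_by_tag_name('li'):
        stats = operator.find_elements_by_tag_name('p')
        # get_attribute may return None, so fall back to an empty string.
        values = [clean(p.get_attribute('textContent') or "") for p in stats]
        # zip stops at the shorter sequence, so extra <p> tags are ignored.
        rows.append(dict(zip(fields, values)))
    return rows
```

It would be called with the ul element found by XPath in the snippet above, for example `collect_operator_stats(operator_list_set)`. If an li contains more or fewer than four p tags, the resulting dict will be missing fields or drop the extras, so it is worth checking the output against the page before relying on it.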
