Code:
from selenium import webdriver
from bs4 import BeautifulSoup
driver=webdriver.Chrome('H:datascience-pythonseliniumchromedriver.exe')
driver.get('https://www.aljazeera.com/news/')
button = driver.find_element_by_id('btn_showmore_b1_418')
driver.execute_script("arguments[0].click();", button)
content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')
container = soup.select('div.topics-sec-item-cont')
titlelist = []
urllist = []
for items in container:
    if items is not None:
        title = items.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a/h2').text
        url = items.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a')
        titlelist.append(title)
        urllist.append(url.get_attribute('href'))
print(str(titlelist) + '\n')
print(str(urllist) + '\n')
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-acf307cfccb3> in <module>
     18 for items in container:
     19     if items is not None:
---> 20         title = items.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a/h2').text
     21         url = items.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a')
     22
TypeError: 'NoneType' object is not callable
title = items.find_element_by_xpath('//div[@class="col-sm-7 topics-sec-item-cont"]/a/h2').text
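The call on this line never returns an element. Note that items comes from soup.select(...), so it is a BeautifulSoup Tag, not a Selenium WebElement. A Tag has no find_element_by_xpath method; BeautifulSoup treats the unknown attribute as a search for a child tag of that name, finds none, and returns None, and calling None raises this TypeError. Once the lookup goes through Selenium itself, an XPath that returns no object has only two possible causes:
- the XPath you supplied is wrong, or
- the div you are trying to extract from has not finished loading.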
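
One way to avoid both problems is to wait for the div explicitly and then stay inside BeautifulSoup for the extraction. The following is only a minimal sketch, not a tested fix for the live site: the helper names wait_for_page_source and extract_links, the 10-second timeout, and the SAMPLE_HTML markup are assumptions made for illustration; only the class names are taken from the question.

```python
from bs4 import BeautifulSoup


def wait_for_page_source(driver, css_selector, timeout=10):
    """Return driver.page_source once a matching element is present."""
    # Imported lazily so the parsing helper below runs without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
    )
    return driver.page_source


def extract_links(html):
    """Collect titles and hrefs using BeautifulSoup calls only."""
    soup = BeautifulSoup(html, "html.parser")
    titles, urls = [], []
    for item in soup.select("div.topics-sec-item-cont"):
        link = item.find("a", href=True)  # first <a> with an href in this div
        heading = link.find("h2") if link else None
        if link is None or heading is None:
            continue  # skip containers without the expected structure
        titles.append(heading.get_text(strip=True))
        urls.append(link["href"])
    return titles, urls


# Hypothetical markup imitating the structure the question's XPath expects.
SAMPLE_HTML = """
<div class="col-sm-7 topics-sec-item-cont">
  <a href="/news/first-story"><h2> First story </h2></a>
</div>
<div class="col-sm-7 topics-sec-item-cont">
  <a href="/news/second-story"><h2>Second story</h2></a>
</div>
<div class="col-sm-7 topics-sec-item-cont"><p>No link here</p></div>
"""

titles, urls = extract_links(SAMPLE_HTML)
print(titles)
print(urls)
```

With a real browser session you would presumably call extract_links(wait_for_page_source(driver, 'div.topics-sec-item-cont')) after clicking the "show more" button, so that parsing only starts once the containers exist in the DOM.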