Scraping the tags and classes I inspected on a web page returns None



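I am trying to scrape information (title, topic, date, ...) from this page (http://www.tiki-toki.com/timeline/entry/594418/Greenpeace#vars!date=2050-10-20_02:52:36!) using BeautifulSoup. When I print the result to check whether it matches what I inspected in the browser, it returns None.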

from bs4 import BeautifulSoup
import requests
html_text = requests.get('http://www.tiki-toki.com/timeline/entry/594418/Greenpeace#vars!date=2050-10-20_02:52:36!').text
soup = BeautifulSoup(html_text, 'lxml')
events = soup.find('div', class_ = 'tl-story-block tl-story-category-view-standard tl-sb-low-height')
print(events)

Can you help me solve this? Thank you very much!

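This happens because the page loads its events dynamically with JavaScript. requests only downloads the initial HTML and never runs those scripts, so the elements you inspected in the browser are not in the response. Instead, use selenium with a web driver so that the events are loaded before you scrape.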
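One way to confirm this diagnosis is to check whether the class name appears anywhere in the raw HTML. The sketch below uses only the standard library; `raw_html` is a hypothetical stand-in for the kind of markup a JavaScript-driven page serves before its scripts run (the `tl-container` class and `app.js` file name are made up for illustration, not taken from the real site).

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_in_html(html_text):
    """Return the set of class names present in the raw HTML string."""
    collector = ClassCollector()
    collector.feed(html_text)
    return collector.classes


# Hypothetical example: an empty container plus a script tag,
# i.e. what the server sends before any JavaScript has run.
raw_html = """
<html><body>
  <div id="timeline" class="tl-container"></div>
  <script src="app.js"></script>
</body></html>
"""

found = classes_in_html(raw_html)
print("tl-story-block" in found)  # False
print("tl-container" in found)    # True
```

To run the same check against the real page, pass `requests.get(url).text` to `classes_in_html`. If `tl-story-block` is missing from the result, the elements are being created in the browser and no choice of parser will find them in the requests response.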

You can download the ChromeDriver executable. If you place it in the same folder as your script, you can run:

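import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

# Configure a headless Chrome driver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
# Change this to your path if chromedriver is not in the working directory
chrome_driver = os.path.join(os.getcwd(), "chromedriver.exe")
# Selenium 4 takes the driver path through a Service object
# (executable_path was deprecated in Selenium 4 and later removed)
driver = webdriver.Chrome(service=Service(chrome_driver), options=chrome_options)
driver.get('http://www.tiki-toki.com/timeline/entry/594418/Greenpeace#vars!date=2050-10-20_02:52:36!')
# Parse the HTML after the browser has executed the page's JavaScript
soup = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()  # the HTML is already captured, so the browser can be closed
events = soup.find_all('div', class_='tl-story-block tl-story-category-view-standard tl-sb-low-height')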

for event in events:
    topic = event.find('h5')
    date = event.find('h4')
    title = event.find('h3')
    print(title.text, ' - ', date.text, ' - ', topic.text)

Output

Woolworths’ commitment to 100% renewable electricity by 2025  -  November 2020  -  Climate
Bunnings commits to 100% renewable electricity by 2025  -  October 2020  -  Climate
Supermarket chain ALDI commits to to purchase 100% of its electricity from...  -  August 2020  -  Climate
Italian giant UniCredit to phase out coal by 2028  -  August 2020  -  Climate
Greenpeace Spain wins court case to stop dismantling of Low Emissions Zone...  -  June 2020  -  Toxics
Greenpeace NL wins court case against Dutch state for oil rig protest  -  June 2020  -  Climate
Senegalese government reject 52 new  industrial fishing licenses  -  June 2020  -  Oceans
...
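If the script sometimes finds no events, one possible cause is that the page source was read before the page's JavaScript had finished adding them. A common remedy is to poll until the elements show up. Selenium provides `WebDriverWait` together with `expected_conditions` for this; the sketch below shows the same idea in plain Python so it can be tried without a browser. The names `wait_until` and `fake_find_events` are hypothetical helpers written for this illustration, not part of Selenium or BeautifulSoup.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose events only appear on the third check.
# With Selenium, the condition would instead re-parse driver.page_source.
attempts = {"count": 0}


def fake_find_events():
    attempts["count"] += 1
    return ["event"] if attempts["count"] >= 3 else []


events = wait_until(fake_find_events, timeout=5.0, poll_interval=0.01)
print(events, attempts["count"])  # ['event'] 3
```

In a Selenium script, the condition could be a lambda that builds a fresh `BeautifulSoup(driver.page_source, "html.parser")` and returns the result of `find_all`, so the loop ends as soon as at least one event block is present. The browser has to stay open while polling, so `driver.quit()` should only be called after the wait returns.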
