How to convert a bs4.element.ResultSet to a date/string



I want to extract the date and summary of an article on a website. Here is my code:

from bs4 import BeautifulSoup
from selenium import webdriver
full_url = 'https://www.wsj.com/articles/readers-favorite-summer-recipes-11599238648?mod=searchresults&page=1&pos=20'
url0 = full_url
browser0 = webdriver.Chrome('C:/Users/liuzh/Downloads/chromedriver_win32/chromedriver')
browser0.get(url0)

html0 = browser0.page_source
page_soup = BeautifulSoup(html0, 'html5lib')
date = page_soup.find_all("time", class_="timestamp article__timestamp flexbox__flex--1")
sub_head = page_soup.find_all("h2", class_="sub-head")
print(date)
print(sub_head)

I get the following result. How can I get the values as plain text? (For example: Sept. 4, 2020 12:57 pm ET; This Labor Day weekend, we…)

[<time class="timestamp article__timestamp flexbox__flex--1">
Sept. 4, 2020 12:57 pm ET
</time>]
[<h2 class="sub-head" itemprop="description">This Labor Day weekend, we’re savoring the last of summer with a collection of seasonal recipes shared by Wall Street Journal readers. Each one comes with a story about what this food means to a family and why they return to it each year.</h2>]

Thanks.

find_all() returns a ResultSet, which is a list of Tag objects, so loop over it and take each tag's text:

for d in date:
    print(d.text.strip())

Given your sample HTML, the output should be:

Sept. 4, 2020 12:57 pm ET
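
The same approach works for the sub_head element, and the extracted date string can be parsed into a datetime if a real date object is needed. The following is a minimal sketch rather than a definitive solution. It assumes the markup shown in the question (with the summary shortened to its first sentence), uses the built-in "html.parser" instead of html5lib so it runs without extra parsers, and introduces a hypothetical helper, parse_wsj_timestamp, that only handles the "Sept." spelling and the " ET" suffix seen in this one example. Other month spellings or time zones would need extra handling.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Sample markup copied from the question's output (summary shortened).
html = """
<time class="timestamp article__timestamp flexbox__flex--1">
Sept. 4, 2020 12:57 pm ET
</time>
<h2 class="sub-head" itemprop="description">This Labor Day weekend, we’re savoring the last of summer with a collection of seasonal recipes shared by Wall Street Journal readers.</h2>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag (or None) instead of a ResultSet.
time_tag = soup.find("time", class_="timestamp")
sub_head_tag = soup.find("h2", class_="sub-head")

# get_text(strip=True) returns the tag's text with surrounding whitespace removed.
date_text = time_tag.get_text(strip=True)
summary = sub_head_tag.get_text(strip=True)


def parse_wsj_timestamp(text):
    """Hypothetical helper: turn 'Sept. 4, 2020 12:57 pm ET' into a naive datetime."""
    # Assumption: only the 'Sept.' abbreviation and a trailing ' ET' are handled.
    cleaned = text.replace("Sept.", "Sep")
    if cleaned.endswith(" ET"):
        cleaned = cleaned[: -len(" ET")]
    return datetime.strptime(cleaned, "%b %d, %Y %I:%M %p")


parsed = parse_wsj_timestamp(date_text)

print(date_text)           # Sept. 4, 2020 12:57 pm ET
print(summary)
print(parsed.isoformat())  # 2020-09-04T12:57:00
```

The returned datetime is naive (it carries no time zone), because the " ET" suffix is simply dropped. If find() does not match anything it returns None, so real scraping code should check for that before calling get_text().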
