Filtering an RSS feed by date and keyword with Beautiful Soup



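I managed to scrape an RSS feed from a website using Beautiful Soup 4, but I can't work out how to filter it by today's date and by keyword so that only the titles and links of certain news items are shown.

Is there a way to set up input prompts, for example "Enter the date to view news:" and "Enter preferred keyword:"?

My current code is below: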

import bs4 as bs
import urllib.request
import requests
import csv
source = urllib.request.urlopen('https://www.channelnewsasia.com/rssfeeds/8396082').read()
soup = bs.BeautifulSoup(source,'lxml')
print(soup.get_text())

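The output returns the full list of news items and links, but I only want to see items from a preferred date that match a keyword such as "COVID". Thanks to anyone who can help with this!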

This example prints the articles that contain the word covid and were published no more than 2 days ago:

import requests
from bs4 import BeautifulSoup
from collections import namedtuple
from datetime import datetime, timedelta, timezone

article = namedtuple('article', 'title desc pubdate')

url = 'https://www.channelnewsasia.com/rssfeeds/8396082'
# the lxml HTML parser lowercases tag names, so <pubDate> is selected as 'pubdate'
soup = BeautifulSoup(requests.get(url).content, 'lxml')

articles = []
for title, desc, pubdate in zip(soup.select('item > title'),
                                soup.select('item > description'),
                                soup.select('item > pubdate')):
    # parse the RSS date into a timezone-aware datetime
    d = datetime.strptime(pubdate.get_text(strip=True), '%a, %d %b %Y %H:%M:%S %z')
    articles.append(article(title.get_text(strip=True), desc.get_text(strip=True), d))

# example:
# print articles with keyword COVID published within the last 2 days:
kw = 'covid'
now = datetime.now(timezone.utc)
two_days_ago = timedelta(days=2)

for a in articles:
    # skip articles that mention the keyword in neither title nor description
    if kw not in a.title.lower() and kw not in a.desc.lower():
        continue
    # skip articles older than 2 days
    if now - a.pubdate > two_days_ago:
        continue
    print(a.title)
    print(a.desc)
    print(a.pubdate)
    print('-' * 80)

Prints:

IMM, Clementi Mall, Tanglin Mall and four other locations visited by COVID-19 cases
SINGAPORE: IMM, Clementi Mall, Woodlands Mart, Woodlands North Plaza, Yuhua Village Market and Food Centre, Tanglin Mall and a Housing Development Board (HDB) block at 82 Marine Parade Central were on Monday (Jun 8) added to the list of public places visited by COVID-19 cases during their ...
2020-06-08 23:28:48+08:00
--------------------------------------------------------------------------------
SDP calls for period between Writ of Election and Nomination Day to be extended to 10 days
SINGAPORE: The Singapore Democratic Party (SDP) on Monday (Jun 8) called for the period between the issue of the Writ of Election to Nomination Day to be doubled, in a response to the Election Department's (ELD) release of contingency plans for holding a General Election during the COVID-19 ...
2020-06-08 21:50:38+08:00
--------------------------------------------------------------------------------
What are the COVID-19 safety measures for Polling Day? Here’s what voters need to know
SINGAPORE: The Elections Department (ELD) has issued contingency plans on how the next General Election (GE) will be held amid the COVID-19 outbreak.
Safety measures include dedicated time-bands for seniors to vote, setting up more polling stations and having voters wear gloves before entering ...
2020-06-08 20:20:38+08:00
--------------------------------------------------------------------------------
... and so on.
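To get the input prompts asked about in the question, the same filtering can be wrapped in a function. The following is only a minimal sketch: `Article`, `filter_articles` and `prompt_and_filter` are hypothetical names I made up, the `YYYY-MM-DD` input format is an assumption, and the date comparison uses the calendar date in the feed's own UTC offset (a blank date falls back to your machine's local "today").

```python
from collections import namedtuple
from datetime import date, datetime, timedelta, timezone

# Hypothetical record type, same fields as the `article` namedtuple above.
Article = namedtuple('Article', 'title desc pubdate')


def filter_articles(articles, day, keyword):
    """Return articles published on `day` (a datetime.date) that mention
    `keyword` (case-insensitive) in the title or the description."""
    kw = keyword.lower()
    return [
        a for a in articles
        if a.pubdate.date() == day  # date in the feed's own UTC offset
        and (kw in a.title.lower() or kw in a.desc.lower())
    ]


def prompt_and_filter(articles, ask=input):
    """Ask for a date and a keyword, then filter `articles`.

    `ask` defaults to the built-in input(); it is a parameter only so the
    function can be tried out without typing at a console.
    """
    raw = ask('Enter the date to view news (YYYY-MM-DD, blank for today): ').strip()
    # Assumed input format: YYYY-MM-DD; an empty answer means today.
    day = datetime.strptime(raw, '%Y-%m-%d').date() if raw else date.today()
    keyword = ask('Enter preferred keyword: ').strip()
    return filter_articles(articles, day, keyword)


# Usage, with `articles` built by the scraping loop above:
#
# for a in prompt_and_filter(articles):
#     print(a.title)
#     print(a.pubdate)
```

Note that an empty keyword matches every article, so leaving the keyword blank simply lists everything published on the chosen date.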
