Using Python Newspaper3k to get the URLs for article.text



>I am trying to get the national news from this website. Here is my code.

from newspaper import Article
url = 'https://www.stuff.co.nz/national'
article = Article(url)
article.download()
article.parse()
data = article.text
data.splitlines()

Besides the headline text, I also need the URL for each of those headlines. For example:

Fish sausages recalled People with egg allergies or an intolerance should not consume these products, MPI says
https://www.stuff.co.nz/business/118673724/fish-sausages-sold-in-auckland-and-hamilton-recalled-due-to-egg-allergy-risk

Try it this way:

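import newspaper

url = 'https://www.stuff.co.nz/national'
# build() scans the section page and creates an Article for each link it finds
paper = newspaper.build(url)
for article in paper.articles:
    target = article.title
    # skip entries that have no usable title
    if target and len(target.strip()) > 0:
        print(target.strip().replace('\n', ''))
        print(article.url)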

Output:

Is a text worth a life?
https://www.stuff.co.nz/national/118315357/is-a-text-worth-a-life-a-message-to-kiwi-drivers-on-the-roads-this-holiday-season
Courtenay Place 'going backwards'
https://www.stuff.co.nz/national/118660168/wellington-police-commander-says-rowdy-courtenay-place-going-backwards
New frontier for energy?
https://www.stuff.co.nz/national/117860027/searching-for-new-zealands-electricity-future-in-the-deep-heat

and so on.
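
If the headlines and links are needed as data rather than printed output, the same loop can be wrapped in a small helper. The following is only a sketch: `headline_links` and `fetch_headline_links` are hypothetical names, not part of newspaper, and the helper assumes nothing beyond the `.title` and `.url` attributes used in the answer. Unlike the answer, it replaces embedded newlines with a single space instead of deleting them.

```python
from types import SimpleNamespace


def headline_links(articles):
    """Return (title, url) pairs for articles that carry a non-empty title."""
    pairs = []
    for article in articles:
        title = (article.title or "").strip()
        if title:
            # Collapse embedded newlines and runs of whitespace into single spaces.
            pairs.append((" ".join(title.split()), article.url))
    return pairs


def fetch_headline_links(url):
    """Build a newspaper source for `url` and collect its headline links.

    Hypothetical wrapper; it needs network access and the newspaper3k package.
    """
    import newspaper  # imported lazily so headline_links() works without it

    paper = newspaper.build(url)
    return headline_links(paper.articles)


# Offline check with stand-in objects that mimic the .title and .url attributes.
sample = [
    SimpleNamespace(title="Fish sausages\nrecalled ", url="https://example.com/a"),
    SimpleNamespace(title="", url="https://example.com/b"),
    SimpleNamespace(title=None, url="https://example.com/c"),
]
print(headline_links(sample))
```

For the live site, `fetch_headline_links('https://www.stuff.co.nz/national')` would return the list of pairs. If repeated runs return fewer articles than the first one, newspaper's article caching is a likely cause; `newspaper.build(url, memoize_articles=False)` is the documented way to turn it off.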
