How do I write scraped news headlines to a CSV file, and how do I append new records?



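I'm new to Python and have built a web scraper to pull new news articles from the CNN headlines. I'm trying to get output that looks like the line-by-line items I see when I call print(). I'd like to write the results to a csv file so that each headline is its own row. I'd also like an append version, so that each time I run it, it appends to the file rather than overwriting it. The question is: how do I get the results in the csv file to look like this: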

1) Headline 1 from the scraped data
2) Headline 2 from the scraped data
3) Headline 3 from the scraped data
and so on.

I've pasted my code below:

from bs4 import BeautifulSoup
import requests
import csv
# Enter the website you wish to pull from that has news articles
res = requests.get('http://money.cnn.com/')
soup = BeautifulSoup(res.text, 'lxml')
# Need to pull the ul code from the website by right-clicking and choosing Inspect Element
news_box = soup.find('ul', {'class': '_6322dd28 ad271c3f'})
# Drill down into the li's, as they should always contain an a, which signals the headline for the news article shown.
all_news = news_box.find_all('a')
for news in all_news:
    test = (news.text)
    print(test)
with open('index.csv', 'w') as fobj:
    csvwriter = csv.writer(fobj, delimiter=',')
    for row in test:
        csvwriter.writerow(test)

You can use re.compile together with BeautifulSoup.find_all:

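from bs4 import BeautifulSoup as soup
import requests, re
import csv
d = soup(requests.get('http://money.cnn.com/').text, 'html.parser')
# Match span class names of the form "word _word" or a single word, drop empty strings, and skip the first two matches
articles = list(filter(None, [i.text for i in d.find_all('span', {'class': re.compile(r'^\w+ _\w+|^\w+$')})]))[2:]
# 'a' appends to the file instead of overwriting it; newline='' keeps the csv module from writing blank rows on Windows
with open('articles.csv', 'a', newline='') as f:
    write = csv.writer(f)
    # Wrap each headline in a list so it becomes its own single-column row
    write.writerows([[i] for i in articles])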

Output:

What higher wages means for Domino's and McDonald's  
'Jurassic World' sequel has big opening day amid a surging box office 
Crying migrant girl: What the iconic photo says about press access 
Chanel reveals earnings for the first time in its 108-year history 
Why GE may need to stop paying its 119-year old dividend 
A top Netflix executive is out after using the N-word 
ZTE pays $1 billion fine to US over sanctions violations 
Tariffs on European cars would hurt US auto jobs 
Etsy sellers confront unknowns after Supreme Court ruling 
Chipotle hopes quesadillas and milkshakes bring customers back 
This group is getting ahead in America 
OPEC strikes deal to increase oil production 
Wall Street banks are healthier than ever 
Self-driving Uber driver may have been streaming 'The Voice' 
GM's new Chevy Blazer will be built in Mexico 
"GM is bringing back the Chevy Blazer, an SUV classic "
...
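
If you also want the leading 1), 2), 3) numbering from the question to continue across runs, the append step could be sketched roughly as follows. This is only a sketch under some assumptions: append_headlines is a hypothetical helper name, the headlines are assumed to already be in a list (for example the articles list above), and the number is stored in its own column rather than as a "1)" prefix.

```python
import csv
import os


def append_headlines(path, headlines):
    """Append one numbered row per headline; numbering continues across runs."""
    # Count the rows already in the file so numbering picks up where it left off.
    start = 0
    if os.path.exists(path):
        with open(path, newline='', encoding='utf-8') as f:
            start = sum(1 for _ in csv.reader(f))

    # Mode 'a' appends instead of overwriting; newline='' avoids blank rows on Windows.
    with open(path, 'a', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for number, headline in enumerate(headlines, start=start + 1):
            # writerow expects a sequence of fields; passing a bare string
            # would write one character per column.
            writer.writerow([number, headline.strip()])

    # Return the total number of rows now in the file.
    return start + len(headlines)
```

Calling append_headlines('articles.csv', articles) on each run should then add the new headlines below the existing ones. Note that it does not check for duplicates, so a headline that is still on the page will be written again.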
