BeautifulSoup/csv.writer produces an empty cell when exporting to CSV

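I am scraping a website to get the name, birth and death dates, and the name of the cemetery where a person is buried. For the most part it works fairly well; however, when I export the text to a CSV, I noticed that a blank cell is inserted in the name column after each page. I have a feeling this is related to the loop rather than the HTML tags, but I'm still learning. Any advice is welcome! Thanks, everyone.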

Here is an example of the problem as it appears in Excel.

from dataclasses import replace
import requests
from bs4 import BeautifulSoup
import csv 
api = 'https://www.findagrave.com/memorial/search?'
name = 'firstname=&middlename=&lastname='
years = 'birthyear=&birthyearfilter=&deathyear=&deathyearfilter='
place = 'location=Yulee%2C+Nassau+County%2C+Florida%2C+United+States+of+America&locationId=city_28711'
memorialid = 'memorialid=&mcid='
linkname = 'linkedToName='
daterange = 'datefilter='
plotnum = 'orderby=r&plot='
page = 'page='
url = api + name + "&" + years + "&" + place + "&" + memorialid + "&" + linkname + "&" + daterange + "&" + plotnum + '&' + page

# NOTE: `headers` was not defined in the original post; this placeholder User-Agent is an assumption.
headers = {'User-Agent': 'Mozilla/5.0'}

for page_no in range(1, 93):
    url_final = url + str(page_no)
    page = requests.get(url_final, headers=headers)
    #print(page)
    soup = BeautifulSoup(page.content, "html.parser")
    graves = soup.find_all('div', {'class':'memorial-item py-1'})
    #print(graves)

    #Getting the Names
    grave_name = soup.find_all('h2', {'class':'name-grave'})
    #Dates
    dates = soup.find_all('b', {'class':'birthDeathDates'})
    #Graveyard Name
    grave_yard = soup.find_all('button', {'role': 'link'})
    #print(grave_yard)
    dataset = [(x.text, y.text, z.text) for x,y,z in zip(grave_name, dates, grave_yard)]
    # Append this page's rows to the CSV file
    with open('Fernandiabeach3.csv', 'a',) as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(dataset)

I tried checking whether a similar tag appears at the start of each new page, but I couldn't find anything that stood out.

Change

with open('Fernandiabeach3.csv', 'a',) as csvfile:

to

with open('Fernandiabeach3.csv', 'a', newline='') as csvfile:

Simply put:

Because you did not define what a new line should look like, it adds an extra line.

In technical terms:

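The csv.writer object controls line endings itself and writes \r\n directly into the file. In Python 3 the file must be opened in untranslated text mode, with the parameters 'w', newline='' (an empty string); otherwise it will write \r\r\n on Windows, where the default text mode translates each \n into \r\n.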
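To see the effect on any operating system, here is a minimal sketch. It assumes that passing newline='\r\n' to open() is a fair stand-in for the default Windows text mode, since both translate each \n into \r\n on write. The sample row, the file names, and the helper write_rows are made up for illustration and are not part of the original script.

```python
import csv
import os
import tempfile

# Made-up sample row; not real scraped data.
rows = [("Jane Doe", "1900-1980", "Example Cemetery")]


def write_rows(path, newline):
    """Write rows with csv.writer, then return the raw bytes of the file."""
    with open(path, "w", newline=newline) as csvfile:
        csv.writer(csvfile).writerows(rows)
    with open(path, "rb") as raw:
        return raw.read()


with tempfile.TemporaryDirectory() as tmp:
    # newline="\r\n" translates every "\n" to "\r\n", as Windows text mode does.
    translated = write_rows(os.path.join(tmp, "translated.csv"), "\r\n")
    # newline="" disables translation, so csv.writer's own "\r\n" is kept as is.
    untranslated = write_rows(os.path.join(tmp, "untranslated.csv"), "")

print(translated)    # row ends with \r\r\n, which Excel shows as an extra blank row
print(untranslated)  # row ends with a single \r\n
```

The same newline='' argument also applies when the file is opened in append mode ('a'), as in the question.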

Hope this helps. Happy coding :)
