For loop iteration does not have the expected effect



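I have the code below, which scrapes a website and writes the results to a CSV file. The problem is that, for some reason, the for loop writes multiple copies of each iteration's output when it should write each one only once. Can anyone help and point out what I am missing here? Thanks.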

import requests
from bs4 import BeautifulSoup
import csv
url = 'https://online.computicket.com'
home_page = requests.get(url)
home_page.content
soup = BeautifulSoup(home_page.content, 'lxml')

links = soup.find_all('a', {'class':'info'})
next_link = []
for link in links:
    next_link.append(link.get("href"))

for i in range(0, len(next_link),1):
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])
    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')
        info_name = soup.find('div', {'class' : 'es-cost'})
        heading = soup.find('h1',{'class' : 'full'})
        with open('Don.csv', 'a') as csv_file:
            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())
                print(name.get_text())

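I think your program prints multiple copies because of the nested loop. `for link in (url + next_link[i])` iterates over a string, so its body runs once for every character in the URL, and the `link` variable is never used anywhere inside the loop. Try removing the nested for statement, replacing this part of the code: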

for i in range(0, len(next_link),1):
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])
    for link in (url + next_link[i]):
        new_url.content
        soup = BeautifulSoup(new_url.content, 'lxml')
        info_name = soup.find('div', {'class' : 'es-cost'})
        heading = soup.find('h1',{'class' : 'full'})
        with open('Don.csv', 'a') as csv_file:
            #csv_file.write(heading.get_text())
            for name in soup.find_all('div', {'class' : 'es-cost'}):
                csv_file.write(heading.get_text())
                csv_file.write(name.get_text())
                print(name.get_text())

with this:

for i in range(0, len(next_link),1):
    next_link.append(i)
    print(url + next_link[i])
    new_url = requests.get(url + next_link[i])
    new_url.content
    soup = BeautifulSoup(new_url.content, 'lxml')
    info_name = soup.find('div', {'class' : 'es-cost'})
    heading = soup.find('h1',{'class' : 'full'})
    with open('Don.csv', 'a') as csv_file:
        #csv_file.write(heading.get_text())
        for name in soup.find_all('div', {'class' : 'es-cost'}):
            csv_file.write(heading.get_text())
            csv_file.write(name.get_text())
            print(name.get_text())
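
To make the cause of the duplicates concrete, here is a small sketch that needs no network access. It only counts how often the loop body would run per link, with and without the inner loop. The hrefs in `next_link` and the helper `count_writes` are made up for illustration and are not taken from the site or from the original code.

```python
url = 'https://online.computicket.com'

# Hypothetical hrefs standing in for the links scraped from the home page.
next_link = ['/a', '/bb']


def count_writes(with_inner_loop):
    """Return how many times the loop body would run for each link."""
    counts = []
    for href in next_link:
        full_url = url + href
        writes = 0
        if with_inner_loop:
            # Iterating over a string yields one character at a time,
            # so the body runs len(full_url) times for every link.
            for _ in full_url:
                writes += 1
        else:
            # Without the inner loop the body runs exactly once per link.
            writes += 1
        counts.append(writes)
    return counts


print(count_writes(with_inner_loop=True))
print(count_writes(with_inner_loop=False))  # [1, 1]
```

With the inner loop, each count equals the length of the full URL, which would match the repeated rows in the CSV file. Separately, `next_link.append(i)` only appends integers to the list of hrefs. The loop never reaches them because `range(0, len(next_link), 1)` is evaluated once before the loop starts, so that line can probably be dropped as well.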
