Scraper only outputs data from the last URL to CSV



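I'm new to Python and trying to learn by doing small projects. I'm currently trying to collect some information from various web pages, but whenever the scraped data is written to a CSV, it only seems to contain the data from the last URL.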

Ideally, I'd like it to overwrite the CSV rather than append to it, since I only want a single CSV containing the data from the most recent run.

I've looked through other similar questions on Stack Overflow, but either I don't understand them or they just don't work for me. (Probably the former.)

Any help would be much appreciated.

import csv
import requests
from bs4 import BeautifulSoup
import pandas as pd
URL = ['URL1','URL2']
for URL in URL:
    response = requests.get(URL)
    soup = BeautifulSoup(response.content, 'html.parser')
    nameElement = soup.find('p', attrs={'class':'name'}).a
    nameText = nameElement.text.strip()
    priceElement = soup.find('span', attrs={'class':'price'})
    priceText = priceElement.text.strip()

columns = [['Name','Price'], [nameText, priceText]]

with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(columns)

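In your code, nameText and priceText are reassigned on every pass through the loop, and the file is written only once after the loop ends, so only the values from the last URL are saved. You have to open the file before the for loop and write each row inside the loop: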

import csv
import requests
from bs4 import BeautifulSoup

urls = ['URL1', 'URL2']

# Open the file once, before the loop; mode 'w' replaces any existing file
with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['Name', 'Price'])  # header row
    for url in urls:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        nameElement = soup.find('p', attrs={'class': 'name'}).a
        nameText = nameElement.text.strip()
        priceElement = soup.find('span', attrs={'class': 'price'})
        priceText = priceElement.text.strip()
        # Write one row per URL while the file is still open
        writer.writerow([nameText, priceText])
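
To see the difference without hitting any real site, here is a minimal offline sketch. The `scrape` function and the `FAKE_PAGES` data are hypothetical stand-ins for the requests/BeautifulSoup step, and the rows are written to an in-memory buffer instead of `index.csv`; only the placement of the write relative to the loop is the point.

```python
import csv
import io

# Hypothetical stand-in for the requests/BeautifulSoup step, so the sketch runs offline.
FAKE_PAGES = {
    "URL1": ("Widget", "9.99"),
    "URL2": ("Gadget", "19.99"),
}


def scrape(url):
    """Return (name, price) for a URL; the data here is made up."""
    return FAKE_PAGES[url]


def rows_written_after_loop(urls):
    # Mirrors the question: the variables are reassigned on every pass,
    # so only the values from the final URL survive the loop.
    for url in urls:
        name, price = scrape(url)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows([["Name", "Price"], [name, price]])
    return buffer.getvalue().splitlines()


def rows_written_inside_loop(urls):
    # Mirrors the fix: one writerow() call per URL, inside the loop.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Name", "Price"])
    for url in urls:
        writer.writerow(scrape(url))
    return buffer.getvalue().splitlines()


urls = ["URL1", "URL2"]
print(rows_written_after_loop(urls))   # header plus the last URL only
print(rows_written_inside_loop(urls))  # header plus one row per URL
```

The first function reproduces the symptom from the question; the second keeps every row because the write happens once per iteration.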

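Alternatively, create the list before the for loop, append() each row to it inside the loop, and write the whole list once at the end: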

import csv
import requests
from bs4 import BeautifulSoup

urls = ['URL1', 'URL2']

# Start the list with the header row, before the loop
columns = [['Name', 'Price']]

for url in urls:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    nameElement = soup.find('p', attrs={'class': 'name'}).a
    nameText = nameElement.text.strip()
    priceElement = soup.find('span', attrs={'class': 'price'})
    priceText = priceElement.text.strip()
    # Collect one row per URL
    columns.append([nameText, priceText])

# Write all collected rows in one go; mode 'w' replaces any existing file
with open('index.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(columns)
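
As for wanting to overwrite rather than append: opening the file with mode 'w' already does that, because it truncates an existing file each time the script runs (mode 'a' would append instead). A small sketch of that behaviour, assuming a temporary directory and made-up rows; `save_rows` and `read_rows` are hypothetical helpers for illustration only:

```python
import csv
import os
import tempfile


def save_rows(path, rows):
    # Mode 'w' truncates an existing file, so each run replaces the old contents.
    with open(path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["Name", "Price"])
        writer.writerows(rows)


def read_rows(path):
    # Read the file back as a list of rows to inspect what was kept.
    with open(path, newline="") as csv_file:
        return list(csv.reader(csv_file))


path = os.path.join(tempfile.mkdtemp(), "index.csv")
save_rows(path, [["Widget", "9.99"], ["Gadget", "19.99"]])  # first run
save_rows(path, [["Widget", "8.49"]])                        # later run, newer data
print(read_rows(path))  # only the header and the rows from the later run remain
```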
