Dear Stack Overflow community,

I recently started playing around with Python. I have learned a lot from watching YouTube videos and browsing this platform, but I cannot solve my current problem, and I hope you can help me.

I am trying to scrape information from a website with Python (Anaconda) and write that information to a CSV file. I tried to separate the columns by adding "," in the script, but when I open my CSV file, all the data lands in a single column (A). Instead, I want the data split across separate columns (A & B, and later C, D, E, F, etc. as I add more information).

What do I have to add to this code:
filename = "brands.csv"
f = open(filename, "w")
headers = "brand, shipping\n"
f.write(headers)

for container in containers:
    brand_container = container.findAll("h2", {"class": "product-name"})
    brand = brand_container[0].a.text
    shipping_container = container.findAll("p", {"class": "availability in-stock"})
    shipping = shipping_container[0].text.strip()
    print("brand: " + brand)
    print("shipping: " + shipping)
    f.write(brand + "," + shipping + "," + "\n")
f.close()
Thank you for your help!

Regards,
EDIT: the script completed as suggested by Game0ver:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.scraped-website.com'

# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parsing
page_soup = soup(page_html, "html.parser")

# grabs each product
containers = page_soup.findAll("li", {"class": "item last"})
container = containers[0]

import csv

filename = "brands.csv"
with open(filename, 'w') as csvfile:
    fieldnames = ['brand', 'shipping']
    # define your delimiter
    writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()

for container in containers:
    brand_container = container.findAll("h2", {"class": "product-name"})
    brand = brand_container[0].a.text
    shipping_container = container.findAll("p", {"class": "availability in-stock"})
    shipping = shipping_container[0].text.strip()
    print("brand: " + brand)
    print("shipping: " + shipping)
As I mentioned, this code doesn't work. What am I doing wrong?
You'd be better off using Python's built-in csv module for this:
import csv

filename = "brands.csv"
with open(filename, 'w') as csvfile:
    fieldnames = ['brand', 'shipping']
    # define your delimiter
    writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    # write rows...
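For completeness, the "write rows..." step amounts to calling `writer.writerow()` with a dict keyed by the field names; the brand and shipping values below are placeholders standing in for the scraped data:

```python
import csv

filename = "brands.csv"
with open(filename, 'w', newline='') as csvfile:
    fieldnames = ['brand', 'shipping']
    writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    # each row is a dict keyed by the fieldnames;
    # these values are placeholders for the scraped brand/shipping text
    writer.writerow({'brand': 'ExampleBrand', 'shipping': 'In stock'})
```

Note `newline=''` when opening the file: it prevents the csv module's own line endings from being doubled on Windows.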
Try wrapping your values in double quotes, like:

f.write('"' + brand + '","' + shipping + '"\n')

That said, there are far better ways to handle this generic task and this functionality.
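One reason quoting matters: if a scraped value itself contains a comma, naive string concatenation silently splits it across columns, while the csv module quotes such fields for you. A small sketch with a made-up brand name:

```python
import csv
import io

# "ACME, Inc." contains the delimiter; manual '","'-style concatenation
# would split it into two columns, but csv.writer quotes it automatically
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["ACME, Inc.", "In stock"])
print(buf.getvalue())  # the comma-bearing field comes out double-quoted
```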
You can go either of the ways shown below. As I can't access the url used in your script, I've supplied a working one.
import csv
import requests
from bs4 import BeautifulSoup

url = "https://yts.am/browse-movies"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')

with open("movieinfo.csv", 'w', newline="") as f:
    writer = csv.DictWriter(f, ['name', 'year'])
    writer.writeheader()
    for row in soup.select(".browse-movie-bottom"):
        d = {}
        d['name'] = row.select_one(".browse-movie-title").text
        d['year'] = row.select_one(".browse-movie-year").text
        writer.writerow(d)
Or you can try like the following:
soup = BeautifulSoup(response.content, 'lxml')

with open("movieinfo.csv", 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'year'])
    for row in soup.select(".browse-movie-bottom"):
        name = row.select_one(".browse-movie-title").text
        year = row.select_one(".browse-movie-year").text
        writer.writerow([name, year])
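To double-check that the delimiter actually produced separate columns, you can read the file back with `csv.reader`; the file name and row values in this sketch are placeholders, not data from the scraped site:

```python
import csv

# write a tiny file the same way as above, then read it back
with open("movieinfo_check.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "year"])
    writer.writerow(["Some Movie", "(2018)"])  # placeholder values

with open("movieinfo_check.csv", newline="") as f:
    rows = list(csv.reader(f))
print(rows)  # each inner list is one row split into its columns
```

If `rows` comes back as two-element lists, the columns are fine; a single-column display in Excel is then a matter of how the file is imported, not of how it was written.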