Python: AttributeError: 'bytes' object has no attribute 'find_all'



I have been experimenting with web scraping and trying to get it to work, but this error keeps tripping me up. Here is my code:

import requests
from bs4 import BeautifulSoup
from csv import writer
page = requests.get('https://www.ebay.com/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=pc&_sacat=0')
soup = BeautifulSoup(page.text,'html.parser').encode("utf-8")
posts = soup.find_all(class_='s-item__wrapper clearfix')
with open('ebay.csv', 'w') as csv_file:
    csv_writer = writer(csv_file)
    headers = ['Title', 'Price', 'Link']
    csv_writer.writerow(headers)
    for post in posts:
        price = post.find(class_='s-item__price').get_text().replace('\n', '')
        title = post.find(class_='s-item__title').get_text().replace('\n', '')
        link = post.find('a')['href']
        csv_writer.writerow([title, price, link])

I keep getting this error:

Traceback (most recent call last):
  File "path/WebScraping.py", line 8, in <module>
    posts = soup.find_all(class_='s-item__wrapper clearfix')
AttributeError: 'bytes' object has no attribute 'find_all'

I looked for other solutions but could not find one that works for me. The code works, but only for a third of the page.

You are calling .encode("utf-8") on the soup object, which returns a bytes representation of it, and a bytes object has no method named find_all.
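The failure can be reproduced without a network request. This is a minimal sketch that uses a plain string as a stand-in for the parsed soup (an assumption for illustration only); calling .encode("utf-8") on the soup hands back bytes in the same way:

```python
# Stand-in for the parsed document; the real code has a BeautifulSoup object here.
markup = "<div class='s-item__wrapper clearfix'>pc</div>"

# .encode() returns raw bytes, which know nothing about HTML structure.
encoded = markup.encode("utf-8")
print(type(encoded).__name__)

try:
    encoded.find_all(class_="s-item__wrapper clearfix")
except AttributeError as exc:
    # Same kind of error as in the traceback from the question.
    error_message = str(exc)
    print(error_message)
```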

Replace:

soup = BeautifulSoup(page.text,'html.parser').encode("utf-8")

with:

soup = BeautifulSoup(page.text,'html.parser')

Here is a complete version that collects the titles, prices, and links:

import requests
from bs4 import BeautifulSoup
import csv


def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Collect each field in its own list, in document order.
    names = [item.text for item in soup.find_all(
        "h3", class_="s-item__title")]
    prices = [item.text for item in soup.find_all(
        "span", class_="s-item__price")]
    urls = [item.get("href")
            for item in soup.find_all("a", class_="s-item__link")]
    # newline="" avoids blank rows on Windows; UTF-8 copes with non-ASCII titles.
    with open("result.csv", 'w', newline="", encoding="UTF-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "Price", "Url"])
        data = zip(names, prices, urls)
        writer.writerows(data)


main("https://www.ebay.com/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=pc&_sacat=0")

Output: view online
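One thing to watch when names, prices, and links are gathered in three separate lists: zip stops at the shortest input, so a listing without a price would shorten the output and shift every later row. A small sketch with made-up data (the names and prices below are hypothetical, not taken from eBay) shows the effect; itertools.zip_longest does not repair the alignment, but it keeps every row so the gap is at least visible:

```python
from itertools import zip_longest

# Hypothetical scraped data: the second listing had no price element.
names = ["PC A", "PC B", "PC C"]
prices = ["$100", "$300"]

# zip silently drops the last name and pairs "PC B" with the wrong price.
rows = list(zip(names, prices))
print(rows)

# zip_longest keeps every name, so the length mismatch shows up in the output.
padded = list(zip_longest(names, prices, fillvalue=""))
print(padded)
```

Looking up the price and link inside each listing's own container, as the loop in the question does, avoids the misalignment altogether.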

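Note: you are calling soup.find_all(class_='s-item__wrapper clearfix') rather than soup.find_all("div", class_='s-item__wrapper clearfix').

Naming the tag narrows the search, but it is not what raises the AttributeError. That error comes from calling .encode("utf-8") on the soup, which leaves you with a bytes object.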

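Regarding the UnicodeEncodeError, I believe you are on Windows, where the default encoding is cp1252, which cannot represent some Unicode characters, for example ✔.

Try:

print('\u2714')

The output should be ✔, but your system will raise a UnicodeEncodeError.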
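The same error can be triggered on any platform by encoding explicitly, which avoids depending on the console settings. A small sketch, assuming cp1252 is the encoding in play:

```python
check_mark = "\u2714"  # the ✔ character

# UTF-8 can represent every Unicode code point.
utf8_bytes = check_mark.encode("utf-8")
print(utf8_bytes)

# cp1252 has no mapping for U+2714, so encoding fails.
try:
    check_mark.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    cp1252_ok = False
    print("cp1252 failed:", exc.reason)
```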

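So you have two options. Either change the system locale settings:

Open cmd.exe and type chcp. I expect you will see 437, which is the United States code page.

Then switch to UTF-8 with chcp 65001.

Or write the CSV file as UTF-8 by passing encoding="UTF-8" to open().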
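The second option can be sketched as follows. The sample rows and the temporary file location are assumptions made for illustration; only the encoding="utf-8" and newline="" arguments matter here:

```python
import csv
import os
import tempfile

# Made-up sample data containing a character that cp1252 cannot encode.
rows = [["Name", "Price"], ["Gaming PC \u2714", "$499.00"]]

# Hypothetical output location; a temp directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "result.csv")

# An explicit encoding makes the result independent of the Windows code page;
# newline="" lets the csv module control line endings.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read the file back with the same encoding to confirm the round trip.
with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))
print(read_back == rows)
```

Setting the encoding at the open() call fixes the file output only; printing such characters to a console that is still on a legacy code page can fail in the same way.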
