Scraping used car listings with BeautifulSoup



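I'm trying to write a program that scrapes used car listings from a website and outputs each listing's link, price, mileage, and engine power. Right now it just repeats the first listing over and over. It should output every listing on the page.

The website is in Estonian; I hope that's not a problem.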

import requests
from bs4 import BeautifulSoup
import unicodedata

url = 'https://www.auto24.ee/kasutatud/nimekiri.php?bn=2&a=100&b=7&ae=2&af=50&ssid=21570860'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')

for div in soup.find_all('div', {'class' : 'result-row'}):
    def getLink():
        find_link = soup.find('a', {'class' : 'main'})
        link = (find_link.get('href'))
        link_string = ('https://www.auto24.ee' + link)
        return link_string

    def getPrice():
        find_price = soup.find('span', {'class' : 'price'})
        price = (find_price.get_text())
        price_string = unicodedata.normalize("NFKD", price)
        return price_string + ','

    def getMileage():
        find_mileage = soup.find('span', {'class' : 'mileage'})
        mileage = (find_mileage.get_text())
        return mileage + ','

    def getPower():
        engine = requests.get(getLink())
        kW_string = 'kW'
        engine_stats = BeautifulSoup(engine.text, 'lxml')
        if engine_stats.find(kW_string) != -1:
            power_find = engine_stats.find('tr', {'class' : 'field-mootorvoimsus'})
            power = power_find.find('span', {'class' : 'value'})
            power_string = power.get_text()
            return power_string
        else:
            return ('Engine power not specified.')

    print(getLink() + ',', getPrice(), getMileage(), getPower())

Output:

https://www.auto24.ee/soidukid/3554965, 1600 €, 174 000 km, 1.8
https://www.auto24.ee/soidukid/3554965, 1600 €, 174 000 km, 1.8
https://www.auto24.ee/soidukid/3554965, 1600 €, 174 000 km, 1.8
https://www.auto24.ee/soidukid/3554965, 1600 €, 174 000 km, 1.8

…and so on.

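soup.find() always returns the first match in the whole page, which is why every row prints the same listing. Call find() on div, the current result row, instead: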

import requests
from bs4 import BeautifulSoup
import unicodedata

url = "https://www.auto24.ee/kasutatud/nimekiri.php?bn=2&a=100&b=7&ae=2&af=50&ssid=21570860"
page = requests.get(url)
soup = BeautifulSoup(page.text, "lxml")

for div in soup.find_all("div", {"class": "result-row"}):
    def getLink():
        find_link = div.find("a", {"class": "main"})  # <-- use div.
        link = find_link.get("href")
        link_string = "https://www.auto24.ee" + link
        return link_string

    def getPrice():
        find_price = div.find("span", {"class": "price"})  # <-- use div.
        price = find_price.get_text()
        # NFKD turns non-breaking spaces into ordinary spaces.
        price_string = unicodedata.normalize("NFKD", price)
        return price_string + ","

    def getMileage():
        find_mileage = div.find("span", {"class": "mileage"})  # <-- use div.
        if find_mileage:  # <-- check if mileage exists.
            mileage = find_mileage.get_text()
        else:
            mileage = "N/A"
        return mileage + ","

    def getPower():
        # Engine power is only on the listing's own page, so fetch it.
        engine = requests.get(getLink())
        engine_stats = BeautifulSoup(engine.text, "lxml")
        power_find = engine_stats.find("tr", {"class": "field-mootorvoimsus"})
        # find() returns None (never -1) when nothing matches, so test for None.
        if power_find is not None:
            power = power_find.find("span", {"class": "value"})
            power_string = power.get_text()
            return power_string
        else:
            return "Engine power not specified."

    print(getLink() + ",", getPrice(), getMileage(), getPower())

Prints:

https://www.auto24.ee/soidukid/3554965, 450 €, 174 000 km, 1.8
https://www.auto24.ee/soidukid/3563070, 450 €, 514 000 km, 1.9 (85 kW)
https://www.auto24.ee/soidukid/3564181, 500 €, 323 032 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3563999, 500 €, 374 699 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3550730, 500 €, 420 000 km, 2.0 (85 kW)
https://www.auto24.ee/soidukid/3550284, 500 €, 460 000 km, 2.0 (85 kW)
https://www.auto24.ee/soidukid/3564862, 525 €, 197 000 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3554530, 525 €, N/A, 2.0 (96 kW)
https://www.auto24.ee/soidukid/3565657, 650 €, N/A, 2.0 (100 kW)
https://www.auto24.ee/soidukid/3562678, 650 €, 295 000 km, 1.8 (92 kW)
https://www.auto24.ee/soidukid/3403673, 650 €, N/A, 2.0 V 6 (66 kW)
https://www.auto24.ee/soidukid/3551361, 699 €, 230 000 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3550824, 700 €, 223 000 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3565138, 700 €, 282 740 km, 2.0 (85 kW)
https://www.auto24.ee/soidukid/3565734, 750 €, 330 000 km, 2.0 (107 kW)
https://www.auto24.ee/soidukid/3521637, 850 €, 282 000 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3545388, 899 €, 416 000 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3556481, 900 €, 270 000 km, 2.4 (66 kW)
https://www.auto24.ee/soidukid/3538068, 950 €, 170 300 km, 1.3 (44 kW)
https://www.auto24.ee/soidukid/3544829, 950 €, 426 000 km, 2.0 (66 kW)
https://www.auto24.ee/soidukid/3564562, 990 €, 478 800 km, 1.8 (66 kW)
https://www.auto24.ee/soidukid/1642523, 990 €, 173 123 km, 3.0 B/G (108 kW)
https://www.auto24.ee/soidukid/3559696, 999 €, 315 000 km, 2.0 (74 kW)
https://www.auto24.ee/soidukid/3555474, 1000 €, 67 848 km, 1.4 (51 kW)
https://www.auto24.ee/soidukid/3553789, 1000 €, 188 000 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3554264, 1000 €, 235 000 km, 2.0 (81 kW)
https://www.auto24.ee/soidukid/3555325, 1000 €, N/A, 2.0 (85 kW)
https://www.auto24.ee/soidukid/3473855, 1000 €, 260 000 km, 2.0 TDI (85 kW)
https://www.auto24.ee/soidukid/3552728, 1000 €, 188 000 km, 2.0 (107 kW)
https://www.auto24.ee/soidukid/3526023, 1000 €, 206 000 km, 3.0 (115 kW)
https://www.auto24.ee/soidukid/3564811, 1099 €, N/A, 1.6 (80 kW)
https://www.auto24.ee/soidukid/3543709, 1099 €, 291 000 km, 1.8 TDCi (66 kW)
https://www.auto24.ee/soidukid/3546391, 1100 €, 394 514 km, 1.6 TDi (66 kW)
https://www.auto24.ee/soidukid/3561075, 1100 €, 252 100 km, 1.8 tdi (74 kW)
https://www.auto24.ee/soidukid/3565050, 1100 €, 235 000 km, 2.3 (107 kW)
https://www.auto24.ee/soidukid/2914584, 1100 €, N/A, 3.8 V6
https://www.auto24.ee/soidukid/3545888, 1180 €, 183 800 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3564541, 1200 €, 170 000 km, 1.6 TDI (66 kW)
https://www.auto24.ee/soidukid/3564697, 1200 €, 128 000 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3475104, 1200 €, 350 000 km, 2.3 (107 kW)
https://www.auto24.ee/soidukid/3565359, 1200 €, 320 000 km, 2.0 TDI (96 kW)
https://www.auto24.ee/soidukid/3517414, 1200 €, 295 000 km, 1.8 katalüsaator (92 kW)
https://www.auto24.ee/soidukid/3557519, 1290 €, 125 000 km, 1.3 (51 kW)
https://www.auto24.ee/soidukid/3529534, 1300 €, 199 500 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3563109, 1320 €, 250 000 km, 1.8 (85 kW)
https://www.auto24.ee/soidukid/3511166, 1350 €, 161 400 km, 1.6 R4 (74 kW)
https://www.auto24.ee/soidukid/3537536, 1400 €, 401 779 km, 1.6 (85 kW)
https://www.auto24.ee/soidukid/3296861, 1450 €, 245 000 km, 2.5 (74 kW)
https://www.auto24.ee/soidukid/3556460, 1475 €, 170 133 km, 1.6 (74 kW)
https://www.auto24.ee/soidukid/3559093, 1490 €, 296 300 km, 2.0 (85 kW)
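
The difference between searching the whole document and searching one row can be reproduced offline. The following is a minimal sketch, not the site's real markup: the HTML string is a made-up two-row page that only borrows the class names result-row, main, and price from the code above, and it uses the built-in html.parser so lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row page that mimics the class names used above.
HTML = """
<div class="result-row"><a class="main" href="/soidukid/1">Car A</a><span class="price">450 €</span></div>
<div class="result-row"><a class="main" href="/soidukid/2">Car B</a><span class="price">500 €</span></div>
"""

soup = BeautifulSoup(HTML, "html.parser")
rows = soup.find_all("div", {"class": "result-row"})

# Searching the whole document returns the first link for every row.
from_soup = [soup.find("a", {"class": "main"})["href"] for _ in rows]

# Searching inside each row returns that row's own link.
from_row = [row.find("a", {"class": "main"})["href"] for row in rows]

print(from_soup)  # ['/soidukid/1', '/soidukid/1']
print(from_row)   # ['/soidukid/1', '/soidukid/2']
```

The first list shows the repeated-listing symptom from the question; the second shows the per-row result that calling find() on div produces.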

If you look at the URL as you page through the results, it changes too, so we can use ak=0, ak=50, and so on to fetch each page's data:

import requests
from bs4 import BeautifulSoup

# ak is the result offset: 0, 50, 100 fetch the first three pages.
for offset in range(0, 150, 50):
    print(offset)
    res = requests.get(f"https://www.auto24.ee/kasutatud/nimekiri.php?bn=2&a=100&b=7&ae=2&af=50&ssid=21612624&ak={offset}")
    soup = BeautifulSoup(res.text, "html.parser")
    main_data = soup.find("div", attrs={"id": "usedVehiclesSearchResult-flex"}).find_all("div", class_="description")
    # A separate loop variable avoids shadowing the page offset.
    for item in main_data:
        print(item.find("a", class_="main")["href"], end=" ")
        print(item.find("span", class_="engine").get_text(), end=" ")
        print(item.find("span", class_="price").get_text(), end=" ")
        try:
            print(item.find("span", class_="mileage").get_text())
        except AttributeError:
            # This listing has no mileage span.
            print("NAN")

Output:

0
/soidukid/3554965 1.8 450 € 174 000 km
/soidukid/3563070 1.9 85kW 450 € 514 000 km
/soidukid/3564181 1.6 74kW 500 € 323 032 km
/soidukid/3563999 1.8 85kW 500 € 374 699 km
/soidukid/3550730 2.0 85kW 500 € 420 000 km
..
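
The paging idea can also be kept separate from the scraping itself by building the URLs first. This is a rough sketch under a few assumptions: that ak is a plain result offset in steps of 50, as the answer above suggests, and that the other query parameters (copied from the URL above) stay valid. The helper page_urls is a hypothetical name introduced here for illustration, and nothing is downloaded.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.auto24.ee/kasutatud/nimekiri.php"

# Query parameters copied from the URL in the answer above.
SEARCH_PARAMS = {"bn": 2, "a": 100, "b": 7, "ae": 2, "af": 50, "ssid": 21612624}


def page_urls(pages, page_size=50):
    """Return one search URL per result page, varying only the ak offset."""
    urls = []
    for n in range(pages):
        # Assumption: ak is the index of the first result on the page.
        query = urlencode({**SEARCH_PARAMS, "ak": n * page_size})
        urls.append(f"{BASE_URL}?{query}")
    return urls


for url in page_urls(3):
    print(url)
```

Each URL can then be passed to requests.get() in place of the f-string in the loop above, which keeps the offset arithmetic in one place.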
