Beautiful Soup scraper problems: no text where text exists, and it won't move between pages



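I'm trying to write a small scraping project, just to learn more about scraping and Python in general, but I've run into a few problems that I can't seem to solve despite my best efforts. The goal is to go through my wishlist and produce a CSV file that I can compare against a master list in Excel to see whether any item's stock status has changed. Here is my code: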

import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
from time import sleep
from random import randint
headers = {"Accept-Language": "en-US, en;q=0.5"}
titles = []
links = []
price = []
addtocart = []
pages = np.arange(1, 10, 1)
for page in pages:
    page = requests.get("https://www.instocktrades.com/wishlists/defc57d9758f4ba89683abbc7a3d93?pg=" + str(pages), headers=headers)

    soup = BeautifulSoup(page.text, "html.parser")
    wishlist_div = soup.find_all('div', class_='item thumbplus')
    sleep(randint(2,10))
    for container in wishlist_div:
        #name
        name = container.find('div', class_='title').text.strip()
        titles.append(name)

        #link
        link = container.find('div', attrs={'class' : 'title'})
        for div in link:
            linking = container.find('a')['href']
            link = "https://www.instocktrades.com" + linking
        links.append(link)

        #price
        pricing = container.find('div', class_='price')
        price.append(pricing)

        #addtocart
        cart = container.find('button', class_='btn addtocart') if container.find('button', class_='btn addtocart') else 'Out Of Stock'
        addtocart.append(cart)
#building Pandas dataframe
wishlist = pd.DataFrame({
    'book': titles,
    'link': links,
    'price': price,
    'cart': addtocart
})
wishlist.to_csv('wishlist.csv')
print(wishlist)

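The problems I'm having are as follows:

  1. It won't move on to the next page of the site. I think I've set it up correctly, but it doesn't seem to do anything beyond the first page.
  2. For the price, if I add .text I get an AttributeError: 'NoneType' object has no attribute 'text'. If I leave it as is, the whole HTML element ends up in the CSV, when all I really want is the $27.99: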
<div class="price">
$27.99
</div>
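  3. For the cart part, the point of checking whether the Add to Cart button exists is to tell me whether the item is in stock. Again, if I try to add .text I get another AttributeError saying there is no text. If I leave it as is, it writes out the button's entire HTML, as quoted below. What I want is for the value to be "In stock" if the Add to Cart button exists and "Out of stock" if it doesn't.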
<button class="btn addtocart" data-cart-qty="0" data-code="MAR201512" data-id="66791" data-title="A Walk Through Hell Complete HC (C: 0-1-0)" data-wl="3851484" title="Add to Cart" type="button">
<img alt="Add to Cart" src="/images/cart.png"/> Add to Cart
</button>

I'd be very grateful for any help in fixing these issues. Thank you!

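Use this in your price block. Just search for class_='price'. The problem is that some books have no listed price, so find() returns None for them, and calling .text on None raises the AttributeError.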

pricing = container.find('div', class_='price')
if pricing:
    price.append(pricing.text)
    print(pricing.text)
else:
    print('no pricing')
    price.append(0)
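
The same None check should also cover the cart column from the question. The following is only a minimal sketch run against made-up markup: SAMPLE_HTML, price_text and stock_status are hypothetical names, and the sample items are modelled on the snippets quoted in the question rather than fetched from the site. It uses get_text(strip=True) so the price comes back without the surrounding newlines.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the snippets in the question.
# The second item has neither a price div nor an Add to Cart button.
SAMPLE_HTML = """
<div class="item thumbplus">
  <div class="title"><a href="/products/mar201512/a-walk-through-hell">A Walk Through Hell</a></div>
  <div class="price">
  $27.99
  </div>
  <button class="btn addtocart" type="button">Add to Cart</button>
</div>
<div class="item thumbplus">
  <div class="title"><a href="/products/jul170097/abe-sapien">Abe Sapien</a></div>
</div>
"""


def price_text(container):
    """Return the stripped price string, or None when there is no price div."""
    tag = container.find('div', class_='price')
    return tag.get_text(strip=True) if tag is not None else None


def stock_status(container):
    """Turn the presence of the Add to Cart button into a plain status string."""
    button = container.find('button', class_='btn addtocart')
    return 'In stock' if button is not None else 'Out of stock'


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
rows = [(price_text(c), stock_status(c))
        for c in soup.find_all('div', class_='item thumbplus')]
print(rows)
```

In the original loop this would amount to appending stock_status(container) to addtocart instead of the button tag itself, so the CSV gets a short string rather than the button's HTML.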

Partial output:

https://www.instocktrades.com/products/mar201512/a-walk-through-hell-complete-hc-(c-0-1-0)
$27.99

https://www.instocktrades.com/products/jul170097/abe-sapien-dark-terrible-hc-vol-01
no pricing
https://www.instocktrades.com/products/nov170018/abe-sapien-dark-terrible-hc-vol-02
no pricing
https://www.instocktrades.com/products/mar180092/abe-sapien-drowning-other-stories-hc
no pricing
https://www.instocktrades.com/products/may110255/absolute-all-star-superman-hc
no pricing
https://www.instocktrades.com/products/dec180616/absolute-batman-arkham-asylum-hc-30th-anniv-ed
no pricing
https://www.instocktrades.com/products/apr150293/absolute-batman-the-court-of-owls-hc
no pricing
https://www.instocktrades.com/products/aug180594/absolute-batman-the-black-mirror-hc
no pricing
https://www.instocktrades.com/products/feb201046/absolute-carnage-omnibus-hc
no pricing
https://www.instocktrades.com/products/may190468/absolute-death-hc-new-ed-(mr)
no pricing
https://www.instocktrades.com/products/aug190641/absolute-fourth-world-by-jack-kirby-hc-vol-01
no pricing
https://www.instocktrades.com/products/jan160353/absolute-preacher-hc-vol-01-(mr)
no pricing
https://www.instocktrades.com/products/nov160355/absolute-preacher-hc-vol-02-(mr)
no pricing
https://www.instocktrades.com/products/sep170442/absolute-preacher-hc-vol-03-(mr)
$87.00

https://www.instocktrades.com/products/jul108195/absolute-sandman-vol-1-hc-(mr)
$57.99
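
On the pagination problem: the request in the question concatenates str(pages), which is the whole NumPy array, instead of str(page), the current number, so every request most likely carries the same malformed pg value. The loop variable page is also overwritten by the response. A minimal sketch of building one URL per page follows; page_urls is a hypothetical helper, the range 1 to 9 simply mirrors np.arange(1, 10, 1), and the real page count of the wishlist is not known here.

```python
# Base URL copied from the question.
BASE_URL = "https://www.instocktrades.com/wishlists/defc57d9758f4ba89683abbc7a3d93?pg="


def page_urls(first=1, last=9):
    """Build one URL per page number (hypothetical helper, inclusive range)."""
    return [BASE_URL + str(page) for page in range(first, last + 1)]


urls = page_urls()
for url in urls:
    # In the real script, fetch and parse here, inside the loop, e.g.
    # response = requests.get(url, headers=headers)
    print(url)
```

Keeping the response in its own variable (response rather than page) and doing the BeautifulSoup parsing inside this loop should let each page contribute its own items.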
