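I added the part below to my code and I can't work out what is happening. This is the link I'm using: https://www.digikey.com/products/en?keywords=ID82C55
Within a single run:
- my CSS selectors return None,
- then they find a few of the HTML elements and return some of them,
- then they find the last element.
As a result, my program mixes up the data and writes it to my CSV file incorrectly. Can anyone tell me what is going wrong here? Thanks.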
Code:
def parse(self, response):
    for b in response.css('div#pdp_content.product-details > div'):
        if b.css('div.product-details-headline h1::text').get():
            part = b.css('div.product-details-headline h1::text').get()
            part = part.strip()
            parts1 = part
            print(b.css('div.product-details-headline h1::text').get())
            print(parts1)
        else:
            print(b.css('div.product-details-headline h1::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get():
            cleaned_quantity = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get()
            print(cleaned_quantity)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(3)::text').get())
        if b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get():
            cleaned_price = b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get()
            print(cleaned_price)
        else:
            print(b.css('table.product-dollars > tr:nth-last-child(1) td:nth-last-child(2)::text').get())
        if b.css('div.quantity-message span#dkQty::text').get():
            cleaned_stock = b.css('div.quantity-message span#dkQty::text').get()
            print(cleaned_stock)
        else:
            print(b.css('div.quantity-message span#dkQty::text').get())
        if b.css('table#product-attribute-table > tr:nth-child(7) td::text').get():
            status = b.css('table#product-attribute-table > tr:nth-child(7) td::text').get()
            status = status.strip()
            cleaned_status = status
            print(cleaned_status)
        else:
            print(b.css('table#product-attribute-table > tr:nth-child(7) td::text').get())
        # yield {
        #     'Part': parts1,
        #     'Quantity': cleaned_quantity,
        #     'Price': cleaned_price,
        #     'Stock': cleaned_stock,
        #     'Status': cleaned_status,
        # }
Output:
None
None
None
None
None
None
2,500
29.10828
29
None
ID82C55A
ID82C55A
None
None
None
Active
I strongly recommend switching to XPath expressions:
part_number = b.xpath('.//th[.="Manufacturer Part Number"]/following-sibling::td[1]/text()').get()
stock = b.xpath('.//span[.="In Stock"]/preceding-sibling::span[1]/text()').get()
etc.
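
The output in the question suggests why the values get mixed: the loop appears to visit three child divs, and each one holds only part of the data, so every selector returns None in the iterations where its element lives in a different div. Below is a minimal sketch of the label-anchored idea, run once against the whole container instead of once per child. It uses the standard library's ElementTree as a stand-in for Scrapy's selectors so that it runs without a network request. The sample markup, the "Part Status" label, and the helper names cell_for_label and extract_item are assumptions for illustration, not the real page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified markup. The real page layout is an assumption;
# only the ids and the values 29 / ID82C55A / Active come from the question.
SAMPLE = """
<div id="pdp_content">
  <div>
    <span id="dkQty">29</span><span>In Stock</span>
  </div>
  <div>
    <table id="product-attribute-table">
      <tr><th>Manufacturer Part Number</th><td>ID82C55A</td></tr>
      <tr><th>Part Status</th><td>Active</td></tr>
    </table>
  </div>
</div>
""".strip()


def cell_for_label(node, label):
    """Return the stripped <td> text of the row whose <th> text equals label, else None."""
    # Select the row by its header text, not by its position in the table.
    cell = node.find(f".//tr[th='{label}']/td")
    if cell is None or cell.text is None:
        return None
    return cell.text.strip()


def extract_item(node):
    """Build one item from whatever is found below node; missing fields stay None."""
    stock = node.find(".//span[@id='dkQty']")
    return {
        "Part": cell_for_label(node, "Manufacturer Part Number"),
        "Stock": stock.text.strip() if stock is not None and stock.text else None,
        "Status": cell_for_label(node, "Part Status"),
    }


root = ET.fromstring(SAMPLE)

# Looping over the child divs reproduces the problem: each item is only partly filled.
per_child = [extract_item(child) for child in root]

# Querying the container once yields a single complete item.
whole_page = extract_item(root)

print(per_child)
print(whole_page)
```

In the spider itself, the equivalent would be to call response.xpath(...) with expressions like the two shown above and yield a single item per page. ElementTree supports only a subset of XPath and has no following-sibling axis, which is why this sketch selects the row by its th text instead.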