Scrapy gets results in the shell but not in the script

Yet another thread ^^ Following the advice given here, I implemented my bot and tested it in the shell:

name_list = response.css("h2.label.title::text").extract()
packaging_list = response.css("div.label.packaging::text").extract()
ean = response.css("h1.page-title::text").extract_first()
product_price = ''.join(response.css('.product-pricing__main-price  ::text').extract())
company = "carrefour"
for name, packaging, price in zip(name_list, packaging_list, product_price):
    item = ScrapybotItem()
    item['ean'] = ean
    item['desc'] = name.replace("\n", "").strip() + " " + packaging
    item['price'] = price
    item['company'] = company
    yield item

The problem is with the price field.

In the shell, the price comes out like this, for example:

In [2]: product_price
Out[2]: '\n                    5,65€\n\n  \n      '

Script output for the same product:

{'company': 'carrefour',
 'desc': "Gel nettoyant anti-imperfections 5 en 1 L'Oréal Paris Men Expert le "
         'tube de 150ml',
 'ean': '\n  1 résultat pour « 3600522418634 »\n',
 'price': '\n'}

Do you know why I can't get the price in the script?

product_price is a string, because you join the selector results into a single value:

product_price = ''.join(response.css('.product-pricing__main-price  ::text').extract())

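Then, when you pass it to zip, the string is iterated one character at a time, so the first item gets '\n' as its price, since that is the first character of product_price.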

Check this example:

>>> for i, j, k in zip([1, 2, 3, 4], [5, 6, 7, 8], 'abcd'):
...     print(i, j, k)

Output (in Python 3):

1 5 a
2 6 b
3 7 c
4 8 d
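One possible way around this, sketched under the assumption that the page shows a single price that applies to every item: clean the joined string once and leave it out of zip. The sample lists below are made up to stand in for what response.css(...).extract() would return, and build_items is a hypothetical helper, not part of Scrapy.

```python
# Stand-ins for the values returned by response.css(...).extract();
# they mimic the shapes shown in the question and are not scraped data.
name_list = ["\n  Gel nettoyant anti-imperfections 5 en 1  \n"]
packaging_list = ["le tube de 150ml"]
price_parts = ["\n                    5,65€\n\n  \n      "]

# Join the text nodes and strip the surrounding whitespace once.
product_price = "".join(price_parts).strip()


def build_items(names, packagings, price):
    """Pair each name with its packaging and reuse the single cleaned price.

    Hypothetical helper: the price is deliberately NOT passed to zip,
    so it is never split into individual characters.
    """
    items = []
    for name, packaging in zip(names, packagings):
        items.append({
            "desc": name.replace("\n", "").strip() + " " + packaging,
            "price": price,
        })
    return items


items = build_items(name_list, packaging_list, product_price)
print(items)
```

If the page actually lists one price per product, the alternative would be to keep the prices as a list (without the join) and zip that list with the names and packagings instead.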
