HTML vs response.url - How to grab the price with xpath

This is the HTML I see in the browser:

   <li class="lvprice prc">
            <span  class="bold">    
                    $72.95</span>
                </li>

So my XPath to get the price is:

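from scrapy.selector import Selector

prices = Selector(response).xpath('//*[@class = "lvprice prc"]')
for price in prices:
    # Drop the leading "$" and the thousands separators before converting to float.
    item['price'] = float(price.xpath('span[1]/text()').extract()[0].strip()[1:].replace(',', ''))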

It does not work for some URLs, so I looked at the response for the ones where it fails. The response looks like this:

<li class="lvprice prc">
        <span  class="bold">
                <b>ZAR</b> 2,656.74</span>
            </li>

Any suggestions on how to handle this?

Thanks! (The domain is ebay.com)

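Are those prices really in the DOM before any Ajax runs?

Maybe the prices are being loaded by an Ajax call.

Try disabling JS in your browser and then look at the HTML of that page.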
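The same check can be made without a browser by scanning the raw HTML that was downloaded (for example `response.text` in Scrapy) for an element carrying both classes. The following is only a minimal sketch using the standard library; `ClassFinder` and `has_classes` are hypothetical names introduced here, not part of Scrapy.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries all of the wanted class tokens."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute is a whitespace-separated list of tokens.
        classes = set((dict(attrs).get("class") or "").split())
        if self.wanted <= classes:
            self.found = True


def has_classes(html, *wanted):
    """Return True if the raw HTML contains a tag with every class in `wanted`."""
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.found
```

If `has_classes(response.text, "lvprice", "prc")` is false, the price markup was not in the downloaded page at all, so no XPath will find it.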

By the way, use this XPath to select elements that have more than one class:

//*[contains(@class, 'class1') and contains(@class, 'class2')]
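One caveat: `contains(@class, 'prc')` is a substring test, so it would also match a class such as `prcx`. A common workaround is to pad the attribute with spaces and match whole tokens. The helper below is a sketch that builds such an expression; `xpath_for_classes` is a hypothetical name, and the result is a plain string that can be passed to `.xpath(...)`.

```python
def xpath_for_classes(*class_names):
    """Build an XPath matching any element that carries every given class token."""
    if not class_names:
        raise ValueError("at least one class name is required")
    conditions = [
        # Padding with spaces makes ' prc ' match only the whole token.
        "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(name)
        for name in class_names
    ]
    return "//*[" + " and ".join(conditions) + "]"
```

For the markup in the question this would be called as `xpath_for_classes("lvprice", "prc")`, which matches regardless of the order of the classes or of extra classes on the element.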

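Edit:

I am 100% sure eBay is showing a captcha page because of many requests from the same IP. Remember that eBay is not a small site; it is a very large company and it is against scraping. They block the IPs that scrape them.

I have also scraped Amazon, eBay, and several other big sites, and they really are against crawling.

Do this to see what the response looks like when the price is not in it: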

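from scrapy.selector import Selector
from scrapy.utils.response import open_in_browser

def parse_details(self, response):
    try:
        Selector(response).xpath('//*[@class = "lvprice prc"]').extract()[0]
    except IndexError:
        # No price node was found: open the response exactly as Scrapy received it.
        open_in_browser(response)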

This opens the scraped page in your operating system's default browser.
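Separately, when the price node is present, the two formats shown in the question (`$72.95` and `<b>ZAR</b> 2,656.74`) differ in where the currency marker sits: in the second one the first text node of the `span` is only whitespace, so `span[1]/text()` followed by `[0]` yields nothing usable. One possible approach is to join all text under the `span` (for example with `span[1]//text()`) and parse the result. The sketch below covers only the parsing step; `parse_price` is a hypothetical helper, and it assumes `,` is a thousands separator and `.` the decimal point.

```python
import re

# An optional currency marker (symbol or code) followed by a number.
_PRICE_RE = re.compile(
    r'^\s*(?P<currency>[^\d\s.,-]+)?\s*(?P<amount>\d[\d,]*(?:\.\d+)?)'
)


def parse_price(text):
    """Return (currency, amount) from text such as '$72.95' or 'ZAR 2,656.74'.

    Returns None when no number is found.
    """
    # Collapse newlines and runs of spaces left over from the HTML layout.
    match = _PRICE_RE.search(" ".join(text.split()))
    if match is None:
        return None
    amount = float(match.group('amount').replace(',', ''))
    return match.group('currency'), amount
```

In the spider this could be used as `parse_price(''.join(price.xpath('span[1]//text()').extract()))`, storing the currency alongside the amount rather than discarding it.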
