Function returns null when trying to access table data

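I am trying to scrape a website with Scrapy to get some data. I located the table with a CSS selector, but it only returns the thead data.

I also tried XPath, but that did not help either. The markup Scrapy receives actually has no tbody tag, which is why the function returns null.

This is the website I am trying to scrape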

def parse(self, response):
    # Select the table through its full chain of ancestor containers
    table = response.css('div.iw_component div.mobile-collapse div.fund-component div#exposureTabs div.component-tabs-panel div.table-chart-container div.fund-component table#tabsSectorDataTable')
    print(table.extract())

I want to access the data in the selected table, which sits inside the tbody tag.

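The data you are looking for is loaded dynamically with JavaScript, which is why Scrapy cannot find it in the downloaded HTML. You can either render the page with Scrapy-Splash or parse the data out of the page source yourself: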

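import json

def parse(self, response):
    # The rows are embedded in an inline <script> as a JavaScript array literal
    table_json = response.xpath('//script[contains(., "var tabsSectorDataTable =")]/text()').re_first(r'var tabsSectorDataTable =(.+?\]);')
    # Parse the captured array text as JSON
    table = json.loads(table_json)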
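
To try the same idea without running a spider, here is a minimal standalone sketch that uses only the standard library. The HTML fragment, the field names "name" and "value", and the helper extract_table_rows are all made up for illustration; the real page's markup and data will differ. It also assumes the array assigned to the variable is valid JSON, which is not guaranteed for every JavaScript literal.

```python
import json
import re

# Hypothetical page fragment (an assumption, not the real site's markup).
SAMPLE_HTML = """
<table id="tabsSectorDataTable"><thead><tr><th>Sector</th><th>Weight</th></tr></thead></table>
<script>
var tabsSectorDataTable =[{"name": "Financials", "value": "25.10"}, {"name": "Energy", "value": "9.80"}];
</script>
"""


def extract_table_rows(html, var_name="tabsSectorDataTable"):
    """Return the list assigned to `var_name` in an inline script, or [] if absent."""
    # Capture from the opening "[" up to the first "];" that follows it.
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\]);"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        return []
    return json.loads(match.group(1))


rows = extract_table_rows(SAMPLE_HTML)
for row in rows:
    print(row["name"], row["value"])
```

Returning an empty list when the variable is missing avoids passing None to json.loads, which is what happens in a spider when re_first finds no match. The non-greedy capture stops at the first "];", so it would need adjusting if the data itself could contain that sequence.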
