I have a function that needs to run twice in Scrapy, each time with a different request.meta:
request = scrapy.Request(tournament_url, callback=self.parse_tournament)
request.meta['data'] = team1_data
yield request
request1 = scrapy.Request(tournament_url, callback=self.parse_tournament)
request1.meta['data'] = team2_data
yield request1
So far, only the first request works!
You will want to pass dont_filter=True to the second Request so that Scrapy's duplicate filter does not drop a URL it has already seen:
request1 = scrapy.Request(tournament_url, callback=self.parse_tournament,
                          dont_filter=True)
request1.meta['data'] = team2_data
yield request1
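To see why the second request disappears, here is a rough sketch of the filtering step. It is a simplified model, not Scrapy's actual code: `FakeRequest`, `ToyDupeFilter`, and `schedule` are hypothetical names made up for illustration, and the URL is a placeholder. The point it tries to show is that the duplicate check looks at the request itself (here just the URL) and ignores meta, so two requests to the same URL count as duplicates unless one of them sets dont_filter.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request with only the fields used here."""
    url: str
    meta: dict = field(default_factory=dict)
    dont_filter: bool = False


class ToyDupeFilter:
    """Simplified model: remembers URLs only, so meta plays no part."""

    def __init__(self):
        self.seen = set()

    def request_seen(self, request):
        # Report True if this URL was recorded before, else record it.
        if request.url in self.seen:
            return True
        self.seen.add(request.url)
        return False


def schedule(requests):
    """Return the requests that survive filtering, in their original order."""
    dupefilter = ToyDupeFilter()
    accepted = []
    for request in requests:
        # A request marked dont_filter skips the duplicate check entirely.
        if not request.dont_filter and dupefilter.request_seen(request):
            continue
        accepted.append(request)
    return accepted


tournament_url = "https://example.com/tournament"  # placeholder URL

without_flag = schedule([
    FakeRequest(tournament_url, meta={"data": "team1"}),
    FakeRequest(tournament_url, meta={"data": "team2"}),
])
with_flag = schedule([
    FakeRequest(tournament_url, meta={"data": "team1"}),
    FakeRequest(tournament_url, meta={"data": "team2"}, dont_filter=True),
])

print(len(without_flag), len(with_flag))
```

In this model the first run keeps only the team1 request, while the second run keeps both, which matches the behaviour described in the question and the fix in the answer.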