Scrapy: Item Loaders and KeyError even though the key is defined



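Intent / expected behavior

Return the text of the links from this page: https://www.bezrealitky.cz/vypis/nabidka-prodej/byt/praha

The output should go both to a CSV file and to the shell.

Error

I get KeyError: 'title', even though I have defined the key in the item loader in Item.py.

Full traceback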

Traceback (most recent call last):
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\utils\defer.py", line 102, in iter_errback
    yield next(it)
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
    for x in result:
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "C:\Users\phili\Documents\Python Scripts\Scrapy Spiders\bezrealitky\bezrealitky\spiders\bezrealitky_spider.py", line 33, in parse
    yield loader.load_item()
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\loader\__init__.py", line 115, in load_item
    value = self.get_output_value(field_name)
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\loader\__init__.py", line 122, in get_output_value
    proc = self.get_output_processor(field_name)
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\loader\__init__.py", line 144, in get_output_processor
    self.default_output_processor)
  File "C:\Users\phili\Anaconda3\envs\py35\lib\site-packages\scrapy\loader\__init__.py", line 154, in _get_item_field_attr
    value = self.item.fields[field_name].get(key, default)
KeyError: 'title'

Spider.py

def parse(self, response):
    # Each listing sits in an element whose class starts with "record".
    for records in response.xpath('//*[starts-with(@class,"record")]'):
        loader = BaseItemLoader(selector=records)
        loader.add_xpath('title', './/div[@class="details"]/h2/a[@href]/text()')
        yield loader.load_item()

Item.py - item loader

class BaseItemLoader(ItemLoader):
    title_in = MapCompose(unidecode)

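Conclusion

I am a bit at a loss. I thought I had followed the Scrapy manual and defined the item loader and its key via "title_in", yet I get a KeyError as soon as I hand a value to it. I checked in the shell that the XPath returns the text I want, so at least that part works. I would appreciate some help!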

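Even if you use an ItemLoader, you still have to define an Item class first and make it known to the item loader, either by setting it as an attribute of the loader: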

class CustomItemLoader(ItemLoader):
    default_item_class = MyItem

or by passing an instance of it to the loader's constructor:

l = CustomItemLoader(item=MyItem())

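Otherwise the item loader knows nothing about the item and its fields. Note that "title_in" only attaches an input processor for a field named title; it does not declare that field on the item, which is why the lookup in self.item.fields fails.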
