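For several days I have been trying to create a scraper, and every project hits the same error: spider not found. No matter what I change or which tutorial I follow, it always returns the same error.
Can anyone suggest where I should look for the mistake?
Thanks!
Windows 10, Python 2.7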
C:.
│   scrapy.cfg
│
└───scrapscrapy
    │   items.py
    │   middlewares.py
    │   pipelines.py
    │   settings.py
    │   settings.pyc
    │   __init__.py
    │   __init__.pyc
    │
    └───spiders
            SSSpider.py
            SSSpider.pyc
items.py
from scrapy.item import Item, Field

class ScrapscrapyItem(Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    Heading = Field()
    Content = Field()
    Source_Website = Field()
    pass
ssspider.py
from scrapy.selector import Selector
from scrapy.spider import Spider
from Scrapscrapy.items import ScrapscrapyItem

class ScrapscrapySpider(Spider):
    name="ss"
    allowed_domains = ["yellowpages.md/rom/companies/info/2683-intelsmdv-srl"]
    start_url = ['http://yellowpages.md/rom/companies/info/2683-intelsmdv-srl/']
    def parse(self, response) :
        sel = Selector (response)
        item = ScrapscrapyItem()
        item['Heading']=sel.xpath('/html/body/div[2]/div[2]/div/div/div/div/div[1]/div/div[2]/div/article/div/div[1]/div[2]/h2').extract
        item['Content']=sel.xpath('/html/body/div[2]/div[2]/div/div/div/div/div[1]/div/div[2]/div/article/div/div[1]/div[2]/div[2]/div/div[2]/div/div[1]/div[1]').extract
        item['Source_Website']= 'yellowpages.md/rom/companies/info/2683-intelsmdv-srl'
        return item
Settings
BOT_NAME = 'scrapscrapy'
SPIDER_MODULES = ['scrapscrapy.spiders']
NEWSPIDER_MODULE = 'scrapscrapy.spiders'
# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'scrapscrapy (+http://www.yourdomain.com)'
# Obey robots.txt rules
ROBOTSTXT_OBEY = True
Command line:
C:\Users\nastea\Desktop\scrapscrapy>scrapy crawl ss
c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py:37: RuntimeWarning:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 31, in _load_all_spiders
    for module in walk_modules(name):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\utils\misc.py", line 63, in walk_modules
    mod = import_module(path)
  File "c:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named spiders
Could not load spiders from module 'scrapscrapy.spiders'. Check SPIDER_MODULES setting
  warnings.warn(msg, RuntimeWarning)
2017-02-19 14:21:16 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: scrapscrapy)
2017-02-19 14:21:16 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'scrapscrapy.spiders', 'SPIDER_MODULES': ['scrapscrapy.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'scrapscrapy'}
Traceback (most recent call last):
  File "c:\python27\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.3.2', 'console_scripts', 'scrapy')()
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 162, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 190, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 194, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 51, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: ss'
Edit
As Elrull suggested, I added an __init__.py file to the spiders folder and also changed it to scrapy. Now the result CMD returns is:
C:\Users\nastea\Desktop\scrapscrapy>scrapy crawl ss
c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py:37: RuntimeWarning:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 31, in _load_all_spiders
    for module in walk_modules(name):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "c:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\nastea\Desktop\scrapscrapy\scrapscrapy\spiders\SSSpider.py", line 3, in <module>
    from Scrapscrapy.items import ScrapscrapyItem
ImportError: No module named Scrapscrapy.items
Could not load spiders from module 'scrapscrapy.spiders'. Check SPIDER_MODULES setting
  warnings.warn(msg, RuntimeWarning)
2017-02-19 15:13:36 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: scrapscrapy)
2017-02-19 15:13:36 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'scrapscrapy.spiders', 'SPIDER_MODULES': ['scrapscrapy.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'scrapscrapy'}
Traceback (most recent call last):
  File "c:\python27\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.3.2', 'console_scripts', 'scrapy')()
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 162, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 190, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 194, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 51, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: ss'
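The traceback after the edit points at a different problem from the first one: line 3 of SSSpider.py imports from Scrapscrapy.items, while the package on disk (and in SPIDER_MODULES) is scrapscrapy. Python module names are case-sensitive, by default even on a case-insensitive file system such as the one on Windows, so the import fails and the spider is never registered. A minimal sketch of that behaviour follows, using a throwaway package built in a temporary directory; the helper name can_import is made up for this illustration and is not part of Scrapy.

```python
import importlib
import os
import sys
import tempfile


def can_import(module_name):
    """Return True if module_name imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# Build a throwaway package that mirrors the layout in the question:
#   <tmp>/scrapscrapy/__init__.py
#   <tmp>/scrapscrapy/items.py
project_dir = tempfile.mkdtemp()
package_dir = os.path.join(project_dir, "scrapscrapy")
os.mkdir(package_dir)
for filename in ("__init__.py", "items.py"):
    open(os.path.join(package_dir, filename), "w").close()

# Make the temporary project importable, as running from the project root would.
sys.path.insert(0, project_dir)

lowercase_ok = can_import("scrapscrapy.items")    # matches the directory name
capitalised_ok = can_import("Scrapscrapy.items")  # wrong case, as in SSSpider.py

print(lowercase_ok, capitalised_ok)
```

In the spider itself, the corresponding change would be to write the import as from scrapscrapy.items import ScrapscrapyItem, matching the directory name exactly.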
It looks like something happened to the __init__.py file in the spiders folder.
Try adding it yourself (leave it empty):
└───spiders
        __init__.py
        SSSpider.py
        SSSpider.pyc
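On Python 2.7 a directory is only importable as a package if it contains an __init__.py, which is why the first traceback reports "No module named spiders". As a rough sketch, the check can be automated by walking the project package and listing directories that hold .py files but no __init__.py. The function name find_dirs_missing_init is hypothetical, chosen only for this example.

```python
import os


def find_dirs_missing_init(package_root):
    """List directories under package_root that hold .py files but no __init__.py."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        has_python_files = any(name.endswith(".py") for name in filenames)
        if has_python_files and "__init__.py" not in filenames:
            missing.append(dirpath)
    return sorted(missing)


if __name__ == "__main__":
    # Assumes the script is run from the project root (next to scrapy.cfg).
    for path in find_dirs_missing_init("scrapscrapy"):
        print("missing __init__.py in: " + path)
```

For the layout in the question this would be expected to report the spiders directory; creating an empty __init__.py there should clear it.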