Scrapy: ERROR: Error downloading <GET http://stackoverflow.com/questions?sort=votes> TypeError: 'float' object is not iterable



I'm new to Python and Scrapy. I copied this code from a video, where it ran fine, but when I try it myself I get TypeError: 'float' object is not iterable. Here is the code:

import scrapy


class StackOverflowSpider(scrapy.Spider):
    name = "stackoverflow"
    start_urls = ["http://stackoverflow.com/questions?sort=votes"]

    def parse(self, response):
        for href in response.css('.question-summary h3 a::attr(href)'):
            full_url = response.urljoin(href.extract())
            yield scrapy.Request(full_url, callback=self.parse_question)

    def parse_question(self, response):
        yield {
            'title': response.css('h1 a::text').extract()[0],
            'votes': response.css(".question.vote-count-post::text").extract()[0],
            'body': response.css(".question.post-text").extract()[0],
            'tags': response.css(".question.post-tag::text").extract(),
            'link': response.url,
        }

And this is the error:

2017-03-10 16:06:39 [scrapy] INFO: Enabled item pipelines:[]
2017-03-10 16:06:39 [scrapy] INFO: Spider opened
2017-03-10 16:06:39 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-03-10 16:06:39 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-03-10 16:06:40 [scrapy] ERROR: Error downloading <GET http://stackoverflow.com/questions?sort=votes>
Traceback (most recent call last):
  File "C:\Anaconda2\lib\site-packages\twisted\internet\defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "C:\Anaconda2\lib\site-packages\twisted\python\failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "C:\Anaconda2\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "C:\Anaconda2\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "C:\Anaconda2\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request
    return handler.download_request(request, spider)
  File "C:\Anaconda2\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 60, in download_request
    return agent.download_request(request)
  File "C:\Anaconda2\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 285, in download_request
    method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
  File "C:\Anaconda2\lib\site-packages\twisted\web\client.py", line 1631, in request
    parsedURI.originForm)
  File "C:\Anaconda2\lib\site-packages\twisted\web\client.py", line 1408, in _requestWithEndpoint
    d = self._pool.getConnection(key, endpoint)
  File "C:\Anaconda2\lib\site-packages\twisted\web\client.py", line 1294, in getConnection
    return self._newConnection(key, endpoint)
  File "C:\Anaconda2\lib\site-packages\twisted\web\client.py", line 1306, in _newConnection
    return endpoint.connect(factory)
  File "C:\Anaconda2\lib\site-packages\twisted\internet\endpoints.py", line 788, in connect
    EndpointReceiver, self._hostText, portNumber=self._port
  File "C:\Anaconda2\lib\site-packages\twisted\internet\_resolver.py", line 174, in resolveHostName
    onAddress = self._simpleResolver.getHostByName(hostName)
  File "C:\Anaconda2\lib\site-packages\scrapy\resolver.py", line 21, in getHostByName
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
  File "C:\Anaconda2\lib\site-packages\twisted\internet\base.py", line 276, in getHostByName
    timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
2017-03-10 16:06:40 [scrapy] INFO: Closing spider (finished)
2017-03-10 16:06:40 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
 'downloader/exception_type_count/exceptions.TypeError': 1,
 'downloader/request_bytes': 235,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 3, 10, 8, 6, 40, 117000),
 'log_count/DEBUG': 1,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2017, 3, 10, 8, 6, 39, 797000)}
2017-03-10 16:06:40 [scrapy] INFO: Spider closed (finished)

Thanks for your help!
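
A note on reading the traceback: the last frame, Twisted's getHostByName, evaluates sum(timeout), and the frame just above it shows Scrapy's CachingThreadedResolver passing timeout straight through. The message therefore suggests that a single float reached code that expected an iterable of timeouts. The snippet below is only a minimal sketch of that mechanism; `timeout_delay` and the value 60.0 are hypothetical stand-ins, not code taken from either library.

```python
def timeout_delay(timeout):
    """Mimic the failing line from the traceback: sum(timeout)."""
    return sum(timeout)


error_message = None
try:
    # A bare float cannot be iterated, so sum() raises TypeError.
    timeout_delay(60.0)
except TypeError as exc:
    error_message = str(exc)

# Wrapping the same value in a tuple makes it iterable, and sum() succeeds.
ok_delay = timeout_delay((60.0,))

print(error_message)
print(ok_delay)
```

If that reading is right, the problem lies in how the installed Scrapy and Twisted versions fit together rather than in the spider code itself.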

Your code works in Python 3, but the items come back as empty lists. I removed the index and ran it again:

2017-03-10 16:48:34 [scrapy.core.scraper] DEBUG: Scraped from <200 http://stackoverflow.com/questions/179123/how-to-modify-existing-unpushed-commits>
{'link': 'http://stackoverflow.com/questions/179123/how-to-modify-existing-unpushed-commits', 'title': ['How to modify existing, unpushed commits?'], 'votes': [], 'body': [], 'tags': []}
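
The empty lists in that output explain why the index had to go: `extract()` returns a list, and indexing an empty list with `[0]` raises IndexError. A small guard like the one below avoids that. `first_or_default` is a hypothetical helper name, and the sample data simply mirrors the scraped item above. Scrapy selectors also offer `extract_first()`, which returns None when nothing matches. Separately, a compound selector such as `.question.vote-count-post` only matches an element that carries both classes; if the vote count sits in a descendant of `.question`, a space between the two class selectors would be needed, which may be why these fields come back empty.

```python
def first_or_default(values, default=None):
    """Return the first element of a list, or `default` if the list is empty."""
    return values[0] if values else default


# Sample data mirroring the scraped item shown above (assumed for illustration).
item = {
    'title': ['How to modify existing, unpushed commits?'],
    'votes': [],
}

title = first_or_default(item['title'])
votes = first_or_default(item['votes'])  # empty list, so this falls back to None

print(title)
print(votes)
```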

I know this is an old question, but I found a different solution in my case: maybe you should try conda install scrapy instead of pip install scrapy.

These are the dependencies that were installed after running the command:

The following NEW packages will be INSTALLED:

    attrs:            15.2.0-py27_0
    automat:          0.5.0-py27_0
    constantly:       15.1.0-py27_0
    cssselect:        1.0.1-py27_0
    hyperlink:        17.1.1-py27_0
    incremental:      16.10.1-py27_0
    parsel:           1.2.0-py27_0
    pyasn1:           0.2.3-py27_0
    pyasn1-modules:   0.0.8-py27_0
    pydispatcher:     2.0.5-py27_0
    queuelib:         1.4.2-py27_0
    scrapy:           1.3.3-py27_0
    service_identity: 17.0.0-py27_0
    twisted:          17.5.0-py27_0
    w3lib:            1.17.0-py27_0
    zope:             1.0-py27_0
    zope.interface:   4.4.2-py27_0
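
Since the fix here came from changing which package versions get installed, it can help to confirm what is actually present in the active environment. The sketch below assumes Python 3.8 or later (for `importlib.metadata`), so it would not run as-is on the Python 2.7 environment listed above; `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Look up the two packages that appear in the traceback.
versions = {name: installed_version(name) for name in ("scrapy", "twisted")}

for name, version in versions.items():
    print(name, version or "not installed")
```

Comparing this output before and after switching from pip to conda shows whether the Scrapy and Twisted versions really changed.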
