Fetching URLs with Tornado futures: two different approaches give different results



I want to use Tornado to fetch a batch of URLs, so my code looks like this:

from tornado.concurrent import Future
from tornado.httpclient import AsyncHTTPClient    
from tornado.ioloop import IOLoop    

class BatchHttpClient(object):    
    def __init__(self, urls, timeout=20):    
        self.async_http_client = AsyncHTTPClient()    
        self.urls = urls    
        self.timeout = timeout
    def __mid(self):    
        results = []    
        for url in self.urls:    
            future = Future()    
            def f_callback(f1):    
                future.set_result(f1.result())    
            f = self.async_http_client.fetch(url)    
            f.add_done_callback(f_callback)    
            results.append(future)    
        return results    
    def get_batch(self):    
        results = IOLoop.current().run_sync(self.__mid)    
        return results    

urls = ["http://www.baidu.com?v={}".format(i) for i in range(10)]    
batch_http_client = BatchHttpClient(urls)    
print batch_http_client.get_batch()    

When I run the code, an error occurs:

ERROR:tornado.application:Exception in callback <function f_callback at 0x7f35458cae60> for <tornado.concurrent.Future object at 0x7f35458c9650>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 317, in _set_done
    cb(self)
  File "/home/q/www/base_data_manager/utils/async_util.py", line 21, in f_callback
    future.set_result(f1.result())
  File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 271, in set_result
    self._set_done()
  File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 315, in _set_done
    for cb in self._callbacks:
TypeError: 'NoneType' object is not iterable

But if I change the code like this:

class BatchHttpClient(object):
    def __init__(self, urls, timeout=20):
        self.async_http_client = AsyncHTTPClient()
        self.urls = urls
        self.timeout = timeout
    def _get_batch(self, url):
        future = Future()
        f = self.async_http_client.fetch(url)
        def callback(f1):
            print future
            print f1.result()
            future.set_result(f1.result())
            print '---------'
        f.add_done_callback(callback)
        return future
    def __mid(self):
        results = []
        for url in self.urls:
            results.append(self._get_batch(url))
        return results
    def get_batch(self):
        results = IOLoop.current().run_sync(self.__mid)
        return results

urls = ["http://www.baidu.com?v={}".format(i) for i in range(10)]
batch_http_client = BatchHttpClient(urls)
for result in batch_http_client.get_batch():
    print result.body

Then it works. All I did was add an intermediate function, so why is the result different?

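In the first code snippet, the problem is that by the time the callbacks execute, the name future refers to the last value assigned in the loop. In other words, when this executes: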

def f_callback(f1):    
    future.set_result(f1.result())    

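future always has the same value. If you add a print future statement, you can see this: the object's address is always the same. Every callback therefore calls set_result on that one Future, and the calls after the first fail because the Future is already done, which appears to be the source of the TypeError in the traceback.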
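The late binding described above can be reproduced without Tornado. The following is a minimal sketch, not the asker's code: it uses the standard library's concurrent.futures.Future purely as a stand-in for Tornado's Future, and build_callbacks is a hypothetical helper name introduced for illustration.

```python
from concurrent.futures import Future  # stand-in for tornado.concurrent.Future


def build_callbacks(n):
    """Create n futures and n callbacks inside a loop, as in the first snippet."""
    futures, callbacks = [], []
    for _ in range(n):
        future = Future()

        def f_callback(value):
            # "future" is looked up when the callback runs,
            # not when the callback is defined.
            return future

        futures.append(future)
        callbacks.append(f_callback)
    return futures, callbacks


futures, callbacks = build_callbacks(3)
targets = [cb(None) for cb in callbacks]

# Every callback sees the future created by the last loop iteration.
all_same = all(t is futures[-1] for t in targets)
print(all_same)
```

All three callbacks end up pointing at the same object, which is why only one Future would ever receive a result in the original loop.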

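In the second snippet, each future and each callback is created inside a function that the loop calls. Each callback therefore gets its future value from a fresh scope, which solves the problem.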

Another way to fix this is to modify __mid like this:

def __mid(self):
    results = []
    for url in self.urls:
        future = Future()
        def make_callback(future):
            def f_callback(f1):
                future.set_result(f1.result())
            return f_callback
        f = self.async_http_client.fetch(url)
        f.add_done_callback(make_callback(future))
        results.append(future)
    return results

Because the callback is created inside make_callback(future), the value of future in each callback comes from a separate scope.
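The factory approach can also be checked in isolation. This is a small sketch under the same assumption as before: concurrent.futures.Future from the standard library stands in for Tornado's Future, and the callbacks are invoked by hand instead of by an HTTP client.

```python
from concurrent.futures import Future  # stand-in for tornado.concurrent.Future


def make_callback(future):
    # Each call creates a new scope, so "future" here is fixed per callback.
    def f_callback(value):
        future.set_result(value)
    return f_callback


futures = []
callbacks = []
for _ in range(3):
    future = Future()
    futures.append(future)
    callbacks.append(make_callback(future))

# Simulate three completed fetches delivering different values.
for i, cb in enumerate(callbacks):
    cb(i * 10)

results = [f.result() for f in futures]
print(results)  # [0, 10, 20]
```

Each Future receives exactly one result, so no Future is resolved twice.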

Louis's answer is correct, but I would like to suggest some simpler alternatives.

First, you can use functools.partial instead of the make_callback wrapper function:

import functools

def __mid(self):
    results = []
    for url in self.urls:
        future = Future()
        def f_callback(output, input):
            # input is the completed fetch future; output is our own future.
            output.set_result(input.result())
        f = self.async_http_client.fetch(url)
        # partial() binds the current value of future to
        # the output argument.
        f.add_done_callback(functools.partial(f_callback, future))
        results.append(future)
    return results
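To see that functools.partial captures the value at the moment it is called, rather than looking the name up later, here is a minimal sketch that uses plain lists as hypothetical stand-ins for the output futures:

```python
import functools


def f_callback(output, value):
    # "output" was bound by partial() when the callback was created.
    output.append(value)


buckets = []
callbacks = []
for _ in range(3):
    bucket = []  # stand-in for a fresh Future per URL
    buckets.append(bucket)
    callbacks.append(functools.partial(f_callback, bucket))

# Simulate three completed fetches delivering different values.
for i, cb in enumerate(callbacks):
    cb(i)

print(buckets)  # [[0], [1], [2]]
```

Each callback writes into its own bucket even though the loop variable was rebound on every iteration.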

But the intermediate Future looks completely unnecessary. The whole method is equivalent to:

def __mid(self):
    return [self.async_http_client.fetch(url) for url in self.urls]

Personally, I would make __mid a coroutine:

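@gen.coroutine  # requires: from tornado import gen
def __mid(self):
    responses = yield [self.async_http_client.fetch(url) for url in self.urls]
    # On Python 2 a generator cannot "return" a value, so raise gen.Return.
    raise gen.Return(responses)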
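For readers on Python 3, the same "wait for a list of fetches" idea can be sketched with the standard library alone. This is an analogue, not Tornado's API: asyncio.gather plays the role of yielding a list of futures, and fake_fetch is a hypothetical stand-in that performs no network access.

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for an HTTP fetch; no request is sent.
    await asyncio.sleep(0)
    return "body of " + url


async def fetch_all(urls):
    # gather() waits for every coroutine and keeps results in input order.
    return await asyncio.gather(*(fake_fetch(u) for u in urls))


urls = ["http://example.com/?v={}".format(i) for i in range(3)]
bodies = asyncio.run(fetch_all(urls))
print(bodies)
```

The results come back in the same order as the input URLs, regardless of which fetch finishes first.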

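If you do not want to use a coroutine, you may prefer to pass a callback to AsyncHTTPClient.fetch instead of calling Future.add_done_callback on its result. Note that the callback argument was removed from fetch in Tornado 6.0, so this applies only to older versions.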
