Why does aiohttp work slower than requests wrapped in run_in_executor?



Hi all!

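I need to make about 10,000 requests to a web service, and I expect JSON in response. Since the requests are independent of each other, I want to run them in parallel. I thought aiohttp could help me with that. I wrote the following code: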

import asyncio
import aiohttp

# URL and TOKEN are defined elsewhere (not shown in the question).

async def execute_module(session: aiohttp.ClientSession, module_id: str,
                         post_body: dict) -> dict:
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    async with session.post(
        fr'{URL}/{module_id}/steps/execute',
        headers=headers,
        json=post_body,
    ) as response:
        return await response.json()

async def execute_all(campaign_ids, post_body):
    # One shared session, so all requests go through one connection pool.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*[
            execute_module(session, campaign_id, post_body)
            for campaign_id in campaign_ids
        ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))

P.S. I ran the test with 1,200 requests.

Another way to solve it is to wrap requests.post in run_in_executor. I know it is wrong to use blocking code in an async function, but it works faster (~7 seconds, versus ~10 seconds for aiohttp).

import requests
import asyncio

# URL and TOKEN are defined elsewhere (not shown in the question).

def execute_module(module_id, post_body):
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    # requests.post creates a new session (and connection pool) on every call.
    return requests.post(
        fr'{URL}/{module_id}/steps/execute',
        headers=headers,
        json=post_body,
    ).json()

async def execute_all(campaign_ids, post_body):
    loop = asyncio.get_running_loop()
    # None selects the event loop's default thread pool executor.
    return await asyncio.gather(*[
        loop.run_in_executor(None, execute_module, campaign_id, post_body)
        for campaign_id in campaign_ids
    ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))

What am I doing wrong?

Have you tried uvloop (https://github.com/MagicStack/uvloop)? It should speed up aiohttp requests.
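A minimal sketch of how uvloop could be swapped in, assuming uvloop 0.18 or newer (which provides uvloop.run) is installed; otherwise the sketch falls back to asyncio.run. The main coroutine is a hypothetical stand-in for execute_all from the question and only reports which loop implementation is running.

```python
import asyncio

try:
    import uvloop
    run = uvloop.run  # available in uvloop >= 0.18
except (ImportError, AttributeError):
    run = asyncio.run  # fallback: the default asyncio event loop


async def main() -> str:
    # Hypothetical stand-in for execute_all(...); it only reports which
    # event loop implementation is actually running.
    loop = asyncio.get_running_loop()
    return type(loop).__module__


if __name__ == "__main__":
    print(run(main()))
```

Whether this closes the gap would have to be measured; the loop implementation is only one of several possible factors.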

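loop.run_in_executor(None, ...) runs synchronous code in a thread pool (multiple threads). The event loop runs code in a single thread.
My guess is that waiting on IO should not make much difference, but processing the response (i.e. JSON decoding) does.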
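The difference can be seen in a small stdlib-only sketch. Here time.sleep and asyncio.sleep are assumed stand-ins for requests.post and session.post, and the function names are made up for illustration. It records which thread each task ran on.

```python
import asyncio
import threading
import time


def blocking_task(delay: float) -> int:
    time.sleep(delay)  # stands in for a blocking requests.post(...) call
    return threading.get_ident()


async def async_task(delay: float) -> int:
    await asyncio.sleep(delay)  # stands in for awaiting session.post(...)
    return threading.get_ident()


async def collect_thread_ids(n: int = 8, delay: float = 0.05):
    loop = asyncio.get_running_loop()
    # Executor jobs are spread over the default thread pool's worker threads.
    pool_ids = await asyncio.gather(
        *[loop.run_in_executor(None, blocking_task, delay) for _ in range(n)]
    )
    # Coroutines all run on the single thread that drives the event loop.
    loop_ids = await asyncio.gather(*[async_task(delay) for _ in range(n)])
    return set(pool_ids), set(loop_ids)


if __name__ == "__main__":
    pool_ids, loop_ids = asyncio.run(collect_thread_ids())
    print(len(pool_ids), "worker threads;", len(loop_ids), "event loop thread")
```

Any per-response work such as JSON decoding is serialized on the one event loop thread in the aiohttp version, while in the executor version it is spread over the worker threads (still subject to the GIL for pure-Python work).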

It may be because the def execute_module calls do not share a requests.Session, i.e. each call has its own connection pool: https://github.com/psf/requests/blob/main/requests/sessions.py#L831 https://github.com/psf/requests/blob/main/requests/adapters.py#L138

On the other hand, async def execute_module runs with a shared aiohttp.ClientSession, which by default is limited to 100 connections: https://docs.aiohttp.org/en/latest/http_request_lifecycle.html#how-to-use-the-clientsession
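To illustrate what such a cap means for 1,200 concurrent requests, here is a stdlib-only simulation. It is an analogy, not aiohttp's actual implementation: asyncio.Semaphore is assumed to play the role of the connection pool, asyncio.sleep the role of network latency, and simulate is a made-up name.

```python
import asyncio


async def simulate(total_requests: int, limit: int, latency: float) -> int:
    """Return the peak number of simulated requests in flight at once."""
    slots = asyncio.Semaphore(limit)  # stands in for the connection pool
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with slots:  # wait here until a "connection" is free
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(latency)  # stands in for network latency
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(total_requests)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(simulate(total_requests=1200, limit=100, latency=0.01)))
```

With a cap of 100, the remaining requests wait for a free slot, so the 1,200 requests are processed in roughly 12 waves rather than all at once.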

To check this, you could pass a custom aiohttp.TCPConnector with a larger limit to aiohttp.ClientSession.

  • https://docs.aiohttp.org/en/latest/http_request_lifecycle.html#how-to-use-the-clientsession
  • https://docs.aiohttp.org/en/latest/client_advanced.html#limiting-connection-pool-size
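A minimal sketch of that experiment, assuming aiohttp is installed. The helper name make_session and the value 500 are arbitrary choices for illustration, not recommendations.

```python
import asyncio
import aiohttp


def make_session(limit: int) -> aiohttp.ClientSession:
    """Create a session whose pool allows `limit` simultaneous connections.

    Must be called from inside a running event loop.
    """
    connector = aiohttp.TCPConnector(limit=limit)  # aiohttp's default is 100
    return aiohttp.ClientSession(connector=connector)


async def main() -> None:
    # 500 is an arbitrary value picked for the experiment.
    async with make_session(limit=500) as session:
        print(session.connector.limit)
        # ... call execute_module(session, ...) here as in the question


if __name__ == "__main__":
    asyncio.run(main())
```

Per the aiohttp docs, limit=0 removes the cap entirely. If the timings do not change with a larger limit, the connection pool is probably not the bottleneck.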
