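Hi all!
I need to make about 10,000 requests to a web service, and I expect JSON in response. Since the requests are independent of each other, I'd like to run them in parallel. I think aiohttp can help me with that. I wrote the following code: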
import asyncio
import aiohttp


async def execute_module(session: aiohttp.ClientSession, module_id: str,
                         post_body: dict) -> dict:
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    async with session.post(
            fr'{URL}/{module_id}/steps/execute',
            headers=headers,
            json=post_body,
    ) as response:
        return await response.json()


async def execute_all(campaign_ids, post_body):
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*[
            execute_module(session, campaign_id, post_body)
            for campaign_id in campaign_ids
        ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))
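P.S. I made 1,200 test requests.
Another way to solve it is to wrap requests.post in run_in_executor. I know it's wrong to use blocking code inside an async function, but it works faster (~7 seconds vs. ~10 seconds for aiohttp):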
import requests
import asyncio


def execute_module(module_id, post_body):
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    return requests.post(
        fr'{URL}/{module_id}/steps/execute',
        headers=headers,
        json=post_body,
    ).json()


async def execute_all(campaign_ids, post_body):
    loop = asyncio.get_running_loop()
    return await asyncio.gather(*[
        loop.run_in_executor(None, execute_module, campaign_id, post_body)
        for campaign_id in campaign_ids
    ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))
What am I doing wrong?
Have you tried uvloop (https://github.com/MagicStack/uvloop)? It should speed up aiohttp requests.
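loop.run_in_executor(None, ...) runs the synchronous code in a thread pool (multiple threads), whereas the event loop runs code in a single thread.
My guess is that waiting on I/O shouldn't make much of a difference, but processing the responses (i.e. JSON decoding) does.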
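The threading difference described above can be seen with a small standard-library sketch. The functions blocking_work and async_work are hypothetical stand-ins for the requests and aiohttp calls. The sketch only shows where the code runs, not how long JSON decoding takes.

```python
import asyncio
import threading
import time


def blocking_work(_: int) -> int:
    """Stand-in for a blocking call such as requests.post(...).json()."""
    time.sleep(0.05)  # simulate waiting on the network
    return threading.get_ident()


async def async_work(_: int) -> int:
    """Stand-in for an aiohttp call awaited on the event loop."""
    await asyncio.sleep(0.05)  # yields to the loop instead of blocking
    return threading.get_ident()


async def collect_thread_ids(n: int = 8) -> tuple[set[int], set[int]]:
    loop = asyncio.get_running_loop()
    # Default executor: a ThreadPoolExecutor, so the work spreads over threads.
    pool_ids = await asyncio.gather(*[
        loop.run_in_executor(None, blocking_work, i) for i in range(n)
    ])
    # Coroutines: all of them run on the single event-loop thread.
    loop_ids = await asyncio.gather(*[async_work(i) for i in range(n)])
    return set(pool_ids), set(loop_ids)


if __name__ == "__main__":
    pool_ids, loop_ids = asyncio.run(collect_thread_ids())
    print(f"executor threads used: {len(pool_ids)}")
    print(f"event-loop threads used: {len(loop_ids)}")
```

Anything that runs between awaits in the coroutine version, response parsing included, is serialized on that one thread.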
It may be because the def execute_module calls don't share a requests.Session, i.e. each call gets its own connection pool: https://github.com/psf/requests/blob/main/requests/sessions.py#L831 https://github.com/psf/requests/blob/main/requests/adapters.py#L138
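For comparison, a variant of the blocking function that does reuse connections might look like the sketch below. It keeps one requests.Session per worker thread, a conservative choice since it is not clear that a single Session can safely be shared across threads. The helper get_session and the url and token parameters are hypothetical and not part of the question's code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import requests

_thread_local = threading.local()


def get_session() -> requests.Session:
    """Return the calling thread's Session, creating it on first use."""
    session = getattr(_thread_local, "session", None)
    if session is None:
        session = requests.Session()
        _thread_local.session = session
    return session


def execute_module(url: str, token: str, post_body: dict) -> dict:
    """Blocking POST that reuses the calling thread's pooled connections."""
    response = get_session().post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json=post_body,  # requests sets the JSON Content-Type header itself
    )
    response.raise_for_status()
    return response.json()
```

It can be submitted with loop.run_in_executor(None, execute_module, url, token, post_body), the same way as in the question.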
On the other hand, async def execute_module runs with a shared aiohttp.ClientSession, which is limited to 100 connections: https://docs.aiohttp.org/en/latest/http_request_lifecycle.html#how-to-use-the-clientsession
To check this, you could pass a custom aiohttp.TCPConnector with a larger limit to aiohttp.ClientSession:
- https://docs.aiohttp.org/en/latest/http_request_lifecycle.html#how-to-use-the-clientsession
- https://docs.aiohttp.org/en/latest/client_advanced.html#limiting-connection-pool-size
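A minimal sketch of that experiment, assuming aiohttp is installed. The limit of 500, the helper names make_session and post_json, and the urls and token parameters are illustrative assumptions rather than recommended values or part of the question's code.

```python
import asyncio

import aiohttp


def make_session(limit: int = 500) -> aiohttp.ClientSession:
    """Create a ClientSession whose pool allows up to `limit` connections.

    Call this from inside a running event loop (e.g. within a coroutine).
    """
    connector = aiohttp.TCPConnector(limit=limit)
    return aiohttp.ClientSession(connector=connector)


async def post_json(session: aiohttp.ClientSession, url: str,
                    post_body: dict, token: str) -> dict:
    """POST a JSON body and decode the JSON response."""
    headers = {"Authorization": f"Bearer {token}"}
    async with session.post(url, headers=headers, json=post_body) as response:
        return await response.json()


async def execute_all(urls: list, post_body: dict, token: str,
                      limit: int = 500) -> list:
    # Closing the session also closes the connector it owns.
    async with make_session(limit) as session:
        return await asyncio.gather(*[
            post_json(session, url, post_body, token) for url in urls
        ])
```

Timing the 1,200 test requests with a few different limit values would show whether the default of 100 connections explains the gap.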