Combining asyncio with a ProcessPoolExecutor with multiple workers



Is it possible to take a blocking function such as work and have it run concurrently in a ProcessPoolExecutor with multiple workers?

import asyncio
from time import sleep, time
from concurrent.futures import ProcessPoolExecutor

num_jobs = 4

queue = asyncio.Queue()
executor = ProcessPoolExecutor(max_workers=num_jobs)
loop = asyncio.get_event_loop()

def work():
    sleep(1)

async def producer():
    for i in range(num_jobs):
        results = await loop.run_in_executor(executor, work)
        await queue.put(results)

async def consumer():
    completed = 0
    while completed < num_jobs:
        job = await queue.get()
        completed += 1

s = time()
loop.run_until_complete(asyncio.gather(producer(), consumer()))
print("duration", time() - s)

Running the above on a machine with 4+ cores takes ~4 seconds. How would you write producer so that the example above takes only ~1 second?

await loop.run_in_executor(executor, work) waits for work to finish before submitting the next job, so you only ever run one function at a time.

To run the jobs concurrently, you can use asyncio.as_completed:

async def producer():
    tasks = [loop.run_in_executor(executor, work) for _ in range(num_jobs)]
    for f in asyncio.as_completed(tasks):
        results = await f
        await queue.put(results)
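
For reference, here is a minimal self-contained sketch of the as_completed approach that you can run directly. It assumes Python 3.7+ (asyncio.run, get_running_loop); the if __name__ == "__main__" guard is needed because ProcessPoolExecutor re-imports the main module when the spawn start method is used:

import asyncio
from time import sleep, time
from concurrent.futures import ProcessPoolExecutor

num_jobs = 4

def work():
    # Blocking function executed in a worker process
    sleep(1)

async def producer(loop, executor, queue):
    # Submit every job up front, then forward results as they finish
    tasks = [loop.run_in_executor(executor, work) for _ in range(num_jobs)]
    for f in asyncio.as_completed(tasks):
        await queue.put(await f)

async def consumer(queue):
    completed = 0
    while completed < num_jobs:
        await queue.get()
        completed += 1

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    with ProcessPoolExecutor(max_workers=num_jobs) as executor:
        s = time()
        await asyncio.gather(producer(loop, executor, queue), consumer(queue))
        print("duration", time() - s)  # ~1 second on a machine with 4+ cores

if __name__ == "__main__":
    asyncio.run(main())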

The problem is in producer. Instead of letting the jobs run in the background, it waits for each job to finish, which serializes them. If you rewrite producer as follows (and keep consumer unchanged), you get the expected 1-second duration:

async def producer():
    for i in range(num_jobs):
        fut = loop.run_in_executor(executor, work)
        fut.add_done_callback(lambda f: queue.put_nowait(f.result()))
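
As a follow-up, here is a sketch of the same idea without add_done_callback: each executor future is wrapped in a small task that forwards its result to the queue (the forward_result helper is illustrative, not part of the original answer). One advantage is that an exception raised by work propagates into the forwarding task rather than being raised inside a done-callback, where it would only be reported by the loop's exception handler:

async def forward_result(fut, queue):
    # Await the executor future and hand its result to the consumer
    await queue.put(await fut)

async def producer():
    for _ in range(num_jobs):
        fut = loop.run_in_executor(executor, work)
        # Schedule one task per job; producer returns immediately
        # while the jobs keep running in the pool
        asyncio.ensure_future(forward_result(fut, queue))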
