Python: calling a process pool without blocking the event loop



If I run the following code:

import asyncio
import time
import concurrent.futures


def cpu_bound(mul):
    for i in range(mul*10**8):
        i+=1
    print('result = ', i)
    return i


async def say_after(delay, what):
    print('sleeping async...')
    await asyncio.sleep(delay)
    print(what)


# The run_in_pool function must not block the event loop
async def run_in_pool():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        result = executor.map(cpu_bound, [1, 1, 1])


async def main():
    task1 = asyncio.create_task(say_after(0.1, 'hello'))
    task2 = asyncio.create_task(run_in_pool())
    task3 = asyncio.create_task(say_after(0.1, 'world'))
    print(f"started at {time.strftime('%X')}")
    await task1
    await task2
    await task3
    print(f"finished at {time.strftime('%X')}")


if __name__ == '__main__':
    asyncio.run(main())

The output is:

started at 18:19:28
sleeping async...
result =  100000000
result =  100000000
result =  100000000
sleeping async...
hello
world
finished at 18:19:34

This shows that the event loop blocks until the CPU-bound job (task2) has finished, and only then continues with task3.

If I run only a single CPU-bound job (with run_in_pool written as follows):

async def run_in_pool():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        result = await loop.run_in_executor(executor, cpu_bound, 1)

then the event loop does not appear to block, because the output is:

started at 18:16:23
sleeping async...
sleeping async...
hello
world
result =  100000000
finished at 18:16:28

How can I run many CPU-bound jobs (in task2) in a process pool without blocking the event loop?

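As you have discovered, you need to use asyncio's own run_in_executor to wait for the submitted tasks to finish without blocking the event loop. Asyncio does not provide an equivalent of map, but it is not hard to emulate: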

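async def run_in_pool():
    # The loop must be looked up here; the name is not defined otherwise.
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # Submit every job first, then await them all together.
        futures = [loop.run_in_executor(executor, cpu_bound, i)
                   for i in (1, 1, 1)]
        result = await asyncio.gather(*futures)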
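
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The helper names map_in_executor and ticker are made up for this illustration and are not part of asyncio, and cpu_bound is a lighter stand-in for the function in the question so that the script finishes quickly.

```python
import asyncio
import concurrent.futures


def cpu_bound(n):
    # Stand-in for the CPU-heavy function from the question. It is defined at
    # module level so a process pool can pickle a reference to it.
    total = 0
    for i in range(n):
        total += i
    return total


async def map_in_executor(executor, func, args):
    # Hypothetical helper (not an asyncio API): submit one call per argument,
    # then await all of them. gather() returns results in the order of `args`.
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(executor, func, arg) for arg in args]
    return await asyncio.gather(*futures)


async def ticker(ticks, count, delay):
    # A second coroutine that runs alongside the pool jobs; it can only make
    # progress if the event loop is not blocked.
    for n in range(count):
        await asyncio.sleep(delay)
        ticks.append(n)


async def main():
    ticks = []
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results, _ = await asyncio.gather(
            map_in_executor(executor, cpu_bound, [10**6, 10**6, 10**6]),
            ticker(ticks, 3, 0.01),
        )
    print("results:", results)
    print("ticks recorded by the concurrent coroutine:", ticks)


if __name__ == "__main__":
    asyncio.run(main())
```

The reason the first version in the question blocks is that nothing in it is awaited: executor.map hands back its iterator right away, but leaving the with block calls shutdown(wait=True), which holds the event loop's thread until every worker is done. In the sketch above the jobs have already been awaited by the time the with block exits, so that shutdown has little left to wait for.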
