如何利用concurrent.futures的python多处理的所有核心

我正在编写一个简单的脚本来对图像数据集进行一些预处理，其中包括调整大小和添加过滤器。

这是我的代码：

def preprocessing(tar_ratio, img_paths, label_paths, 
save_dir="output", resampling_mode=None):
# with concurrent.futures.ThreadPoolExecutor() as executor:
with concurrent.futures.ProcessPoolExecutor() as executor:
for img_path, label_path in zip(img_paths, label_paths):
src_ratio = get_ratio(label_path)
if src_ratio is not np.nan:
executor.submit(
process_single(src_ratio, tar_ratio, img_path, label_path, 
save_dir=save_dir, resampling_mode=resampling_mode)
)
else:
pass

我认为它更受CPU限制，所以multiprocessing比multithreading更合适。但在尝试了这两种方法后，在只使用两个CPU内核的情况下，两者都没有像预期的那样工作。

我读过下面的帖子，我想知道是否有使用concurrent.futures的更新版本？如何利用python多处理的所有核心

Executor.submit方法接受一个可调用的作为第一个参数，但您调用了该函数，请尝试以下操作：

executor.submit(
process_single,
src_ratio,
tar_ratio,
img_path,
label_path,
save_dir=save_dir,
resampling_mode=resampling_mode,
)

一个简单的例子说明了正确的用法：

测试.py:

import random
import time
from concurrent.futures import ProcessPoolExecutor

def worker(i):
t = random.uniform(1, 5)
print(f"START: {i} ({t:.2f}s)")
time.sleep(t)
print(f"END: {i}")
return i * 2

def main():
futures = []
with ProcessPoolExecutor() as executor:
for i in range(5):
futures.append(executor.submit(worker, i))
print([f.result() for f in futures])

if __name__ == "__main__":
main()

示例：

$ python test.py
START: 0 (3.16s)
START: 1 (1.68s)
START: 2 (2.76s)
START: 3 (1.53s)
START: 4 (4.05s)
END: 3
END: 1
END: 2
END: 0
END: 4
[0, 2, 4, 6, 8]

相关内容

最新更新

热门标签：