Python 3: Are the 'multiprocessing' and 'time' modules incompatible?

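I'm using multiprocessing.Pool().imap_unordered(...) to run some tasks in parallel, and I measure the time taken by computing the difference between time.time() before and after starting the pool tasks.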

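However, it returns the wrong result! When I watch my wall clock while the program runs, it shows a runtime of about 5 seconds. But the program itself reports a runtime of only 0.1 seconds.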

I also have a variant of this code without any multiprocessing. It takes twice as long, but it reports the correct runtime.

Here is my code:

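import time
from multiprocessing import Pool, cpu_count

# create_sudoku is defined elsewhere in the program.

if __name__ == "__main__":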
    n = int(input("How many grids to create? "))
    use_multiprocessing = None
    while use_multiprocessing is None:
        answer = input("Use multiprocessing to speed things up? (Y/n) ").strip().lower()
        if len(answer) == 1 and answer in "yn":
            use_multiprocessing = answer == "y"
    t0 = time.time()
    if use_multiprocessing:
        processes = cpu_count()
        worker_pool = Pool(processes)
        print("Creating {} sudokus using {} processes. Please wait...".format(n, processes))
        sudokus = worker_pool.imap_unordered(create_sudoku, range(n), n // processes + 1)
    else:
        progress_bar, progress_bar_length = 0, 10
        sudokus = []
        print("Creating {} sudokus".format(n), end="", flush=True)
        for i in range(n):
            p = int((i / n) * progress_bar_length)
            if p > progress_bar:
                print("." * (p-progress_bar), end="", flush=True)
                progress_bar = p
            new_sudoku = create_sudoku()
            sudokus.append(new_sudoku)
    t = time.time() - t0
    l = len(list(sudokus))
    print("\nSuccessfully created {} grids in {:.6f}s (average {:.3f}ms per grid)!".format(
        l, t, 1000*t/l
    ))

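Here is a sample run, which actually takes about 5-6 seconds (after entering the number of grids to create and whether to use multiprocessing, of course):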

How many grids to create? 100000
Use multiprocessing to speed things up? (Y/n) y
Creating 100000 sudokus using 4 processes. Please wait...
Successfully created 100000 grids in 0.122141s (average 0.001ms per grid)!
Process finished with exit code 0

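Are multiprocessing and time.time() incompatible? I've heard that time.clock() can cause problems in situations like this, but I thought time.time() should be safe. Or is there some other problem?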

I figured it out.

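Pool.imap_unordered(...) returns an iterator, not a list. The call returns right away without waiting for the worker processes, so at that point the results are not available yet; they only arrive as I iterate over them.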
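To see this effect in isolation, one can time the call and the consumption of its results separately. The following is only a minimal sketch: slow_double is a hypothetical stand-in for create_sudoku, and measure_submit_and_drain is a helper name invented for this illustration.

```python
import time
from multiprocessing import Pool


def slow_double(x):
    # Hypothetical stand-in for an expensive task such as create_sudoku.
    time.sleep(0.05)
    return 2 * x


def measure_submit_and_drain(pool, func, items):
    """Return (time until imap_unordered returned, time until all results were read, results)."""
    t0 = time.time()
    results_iter = pool.imap_unordered(func, items)
    submit_time = time.time() - t0  # the call has returned; nothing has been waited for
    results = list(results_iter)    # blocks here until every result has arrived
    total_time = time.time() - t0
    return submit_time, total_time, results


if __name__ == "__main__":
    with Pool(2) as pool:
        submit_time, total_time, results = measure_submit_and_drain(
            pool, slow_double, range(8)
        )
    print("imap_unordered returned after {:.6f}s".format(submit_time))
    print("all {} results read after {:.6f}s".format(len(results), total_time))
```

The first number should be tiny compared with the second, which matches the 0.1 seconds versus 5 seconds seen in the question.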

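I consume it in the line l = len(list(sudokus)), where I convert the iterator to a list to get its length. The finish time was measured one line before that, so it correctly reported only the time needed to set up the iterator. That is not what I wanted; swapping the two lines gives the correct time.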
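The swap could look roughly like the sketch below. It is not the original program: square is a hypothetical placeholder for create_sudoku, and timed_collect is a helper name made up for this example. The only point is that the clock stops after list() has drained the iterator.

```python
import time
from multiprocessing import Pool


def square(x):
    # Hypothetical placeholder workload standing in for create_sudoku.
    return x * x


def timed_collect(pool, func, items, chunksize=1):
    """Run func over items on the pool and return (results, elapsed seconds)."""
    t0 = time.time()
    lazy_results = pool.imap_unordered(func, items, chunksize)
    results = list(lazy_results)  # wait for all results BEFORE stopping the clock
    elapsed = time.time() - t0
    return results, elapsed


if __name__ == "__main__":
    with Pool(2) as pool:
        results, elapsed = timed_collect(pool, square, range(1000), chunksize=100)
    print("Got {} results in {:.6f}s".format(len(results), elapsed))
```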

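I know I probably shouldn't convert an iterator to a list just to find its length and then throw the list away. If I want to keep the iterator, I have to rely on the stored requested length; otherwise I have to use Pool.map(...), which produces a list and blocks until it is ready.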
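Both options might be sketched as follows, again with hypothetical names (cube, timed_map, timed_count) that are not part of the original program. timed_map relies on Pool.map blocking until the full list exists, while timed_count walks the iterator and keeps only a running count instead of building a throwaway list.

```python
import time
from multiprocessing import Pool


def cube(x):
    # Hypothetical placeholder workload standing in for create_sudoku.
    return x ** 3


def timed_map(pool, func, items):
    """Pool.map returns a complete list and blocks until it is ready."""
    t0 = time.time()
    results = pool.map(func, items)
    return results, time.time() - t0


def timed_count(pool, func, items):
    """Consume imap_unordered results one at a time, keeping only a count."""
    t0 = time.time()
    count = 0
    for _ in pool.imap_unordered(func, items):
        count += 1
    return count, time.time() - t0


if __name__ == "__main__":
    with Pool(2) as pool:
        results, t_map = timed_map(pool, cube, range(1000))
        count, t_count = timed_count(pool, cube, range(1000))
    print("map: {} results in {:.6f}s".format(len(results), t_map))
    print("imap_unordered: {} results in {:.6f}s".format(count, t_count))
```

When the number of tasks is already known, as with n in the question, even the counting loop is unnecessary for the length; the iterator only has to be consumed before the clock is stopped.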
