I'm working on an MCTS algorithm and trying to parallelize some of the work by expanding multiple leaves in parallel. After expanding a batch, I want to go back and add the results to the tree (undoing the virtual losses) before selecting the next batch of leaves to expand. Everything works except for the speed: I've found that successive passes around the ProcessPoolExecutor context keep getting slower. The relevant part of the code is:
for _ in range(8):
    tick = time.time()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        value_estimates = executor.map(NeuralNet.evaluate, leaves, chunksize=round(batch_size / 8) + 1)
    tock = time.time()
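For context, the surrounding loop looks roughly like the sketch below. This is simplified: the tree helpers and NeuralNet.evaluate here are placeholders standing in for my real code, not the actual implementation.

import concurrent.futures
import time

class NeuralNet:
    @staticmethod
    def evaluate(leaf):
        return 0.0  # stand-in for the real network evaluation of one leaf

def select_leaf_with_virtual_loss(root):
    return root  # stand-in: descend the tree, adding a virtual loss along the path

def undo_virtual_loss_and_backprop(leaf, value):
    pass  # stand-in: remove the virtual loss and back the value up the tree

def run_batches(root, batch_size=500):
    for _ in range(8):
        # Select a batch of leaves, marking each path with a virtual loss.
        leaves = [select_leaf_with_virtual_loss(root) for _ in range(batch_size)]
        tick = time.time()
        with concurrent.futures.ProcessPoolExecutor() as executor:
            value_estimates = list(executor.map(NeuralNet.evaluate, leaves,
                                                chunksize=round(batch_size / 8) + 1))
        tock = time.time()
        # Add the results back to the tree before selecting the next batch.
        for leaf, value in zip(leaves, value_estimates):
            undo_virtual_loss_and_backprop(leaf, value)
        print(f"Took {tock - tick} sec to run {batch_size} times")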
Running this, I get the following timing results:
Took 2.3520002365112305 sec to run 500 times
Took 2.5691237449645996 sec to run 500 times
Took 2.8875749111175537 sec to run 500 times
Took 3.2885916233062744 sec to run 500 times
Took 3.43363618850708 sec to run 500 times
Took 3.6769683361053467 sec to run 500 times
Took 3.948704719543457 sec to run 500 times
Took 4.299146890640259 sec to run 500 times
The same pattern occurs with larger loops; the time taken just keeps growing.
What is causing this? And is there a way to make every iteration run at the same speed?
Thanks in advance.
I would suggest using the multiprocessing library. It may be interesting to compare.
import numpy as np
import concurrent.futures
from concurrent.futures import as_completed
import time
from multiprocessing import Pool, cpu_count

data = np.random.randint(0, 1000000, 100000000)
N = 10000

def label_data(i):
    print(f'Started executing iteration {i} for data of length {len(data)}')

if __name__ == '__main__':
    # concurrent.futures.ProcessPoolExecutor
    t0 = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = {executor.submit(label_data, i): i for i in range(N)}
        for future in as_completed(futures):
            iteration = futures[future]
    t1 = time.perf_counter() - t0
    print("Time elapsed: ", t1)  # seconds elapsed (floating point)
    time_concurrent = t1

    # multiprocessing.Pool with cpu_count() - 1 workers
    t0 = time.perf_counter()
    with Pool(cpu_count() - 1) as p:
        p.map(label_data, [i for i in range(N)])
    t1 = time.perf_counter() - t0
    print("Time elapsed: ", t1)  # seconds elapsed (floating point)
    time_multiprocessing = t1

    print("------------------------------")
    print("Concurrent library time: ", time_concurrent)
    print("Multiprocessing library time: ", time_multiprocessing)
Here are some results from this example:
Concurrent library time: 4.678165
Multiprocessing library time: 0.01712899999999728
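Applied to the loop from the question, the same idea might look roughly like the sketch below (untested). The pool is created once and reused for every batch, and NeuralNet.evaluate and the leaf batches are just placeholders for the question's actual code:

import time
from multiprocessing import Pool, cpu_count

class NeuralNet:
    @staticmethod
    def evaluate(leaf):
        return 0.0  # placeholder for the real network call

def run_all_batches(batches_of_leaves):
    # Create the pool once, outside the batch loop, and reuse it for every batch.
    with Pool(processes=max(1, cpu_count() - 1)) as pool:
        for leaves in batches_of_leaves:
            tick = time.time()
            value_estimates = pool.map(NeuralNet.evaluate, leaves,
                                       chunksize=round(len(leaves) / 8) + 1)
            tock = time.time()
            print(f"Took {tock - tick:.3f} sec to run {len(leaves)} times")

if __name__ == "__main__":
    # Dummy batches of 500 "leaves" each, just to exercise the loop.
    run_all_batches([[None] * 500 for _ in range(8)])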