Multiprocessing workers won't start the next job



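I am trying to run multiple simulations using the multiprocessing module in Python (3.7). The workers complete the first simulation on time (the function finishes), but then for a long time (much longer than the duration of one simulation) they do not start another simulation. Python uses CPU and memory during this time, but I can't figure out what is going on.

I have pasted the multiprocessing code and its output below. The run_simulation function is too long, and calls too many other functions, to paste into this question. It may well be where the problem lies, though. I tried removing the parts of the function that I thought might be responsible, but I could not find the problem. The simulation saves its results to a file. I thought that might be the cause, but I removed that part of the function and the problem persisted.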

import os, sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))))
from simulation import run_simulation
from datetime import datetime
import multiprocessing as mp

def function_wrapper(idx, sim_type):
    run_start_time = datetime.now()
    # Escaped backslash: a bare "\a" would be read as the bell character
    filename = "matlab_data\\a" + run_start_time.strftime("%y%m%d_%H_") + str(idx).zfill(4) + ".mat"
    print("Starting:", idx, "-", sim_type, "   Start time:", run_start_time.strftime("%H:%M:%S"),
          "   Filename:", filename)
    try:
        output = run_simulation(scenario=sim_type, save_data_to_mat_file=False,
                                matlab_file_name=filename)
        print("Completed: " + str(idx) + "-" + sim_type +
              "   Finish time:" + datetime.now().strftime("%H:%M:%S") +
              "   Duration:" + str(datetime.now() - run_start_time) +
              "   Filename:" + filename)
        return output
    except Exception as error: # Just so one random error doesn't derail an overnight run
        print("Error: " + str(idx) + "-" + sim_type +
              "   Finish time:" + datetime.now().strftime("%H:%M:%S") +
              "   Duration:" + str(datetime.now() - run_start_time) +
              "   Filename:" + filename)
        print(error)
        return None

def multiple_runs(use_parallel=True):
    sim_start_time = datetime.now()
    print("Started at:" + sim_start_time.strftime("%d/%m/%y - %H:%M:%S"))
    n_sets_of_sims = 30
    sim_names = ["TestSim"]
    sims_to_run = sim_names * n_sets_of_sims
    if use_parallel:
        with mp.get_context("spawn").Pool() as pool:
            indicies = range(len(sims_to_run))
            output = pool.starmap_async(function_wrapper, zip(indicies, sims_to_run))
            try:
                output.get(timeout=600)
            except Exception as error:
                print("We lacked patience and got a multiprocessing.TimeoutError")
                print(error)
    else: # Run Sequentially
        for idx, sim_name in enumerate(sims_to_run):
            function_wrapper(idx, sim_name)
    print("Finished at: " + datetime.now().strftime("%d/%m/%y - %H:%M:%S") + "     " +
          "Duration: " + str(datetime.now() - sim_start_time))

if __name__ == '__main__':
    mp.freeze_support()
    multiple_runs(use_parallel=True)

Here is some sample output:

Started at:20/06/21 - 14:51:01
Main loop process id: 15720 
Starting: 0 - TestSim    Start time: 14:51:07    Filename: matlab_dataa210620_14_0000.mat
Starting: 1 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0001.mat
Starting: 2 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0002.mat
Starting: 3 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0003.mat
Starting: 4 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0004.mat
Starting: 5 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0005.mat
Starting: 6 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0006.mat
Starting: 7 - TestSim    Start time: 14:51:08    Filename: matlab_dataa210620_14_0007.mat
Completed: 1-TestSim   Finish time:14:51:51   Duration:0:00:43.349098   Filename:matlab_dataa210620_14_0001.mat
Completed: 3-TestSim   Finish time:14:51:51   Duration:0:00:43.218098   Filename:matlab_dataa210620_14_0003.mat
Completed: 0-TestSim   Finish time:14:51:51   Duration:0:00:43.799112   Filename:matlab_dataa210620_14_0000.mat
Completed: 2-TestSim   Finish time:14:51:51   Duration:0:00:43.709106   Filename:matlab_dataa210620_14_0002.mat
Completed: 4-TestSim   Finish time:14:51:52   Duration:0:00:43.876114   Filename:matlab_dataa210620_14_0004.mat
Completed: 7-TestSim   Finish time:14:51:52   Duration:0:00:43.674106   Filename:matlab_dataa210620_14_0007.mat
Completed: 6-TestSim   Finish time:14:51:52   Duration:0:00:43.852111   Filename:matlab_dataa210620_14_0006.mat
Completed: 5-TestSim   Finish time:14:51:52   Duration:0:00:44.036120   Filename:matlab_dataa210620_14_0005.mat
We lacked patience and got a multiprocessing.TimeoutError
Finished at: 20/06/21 - 15:01:02     Duration: 0:10:00.398742
Process finished with exit code 0
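
In the log above, all eight workers report "Completed" within about 44 seconds, and nothing else happens until the 600 second timeout. One way to check whether the time is going into handing results back to the parent is to timestamp each result as the parent receives it. The following is only a minimal sketch: fake_simulation is a hypothetical stand-in for run_simulation, run_and_log_arrivals is a made-up helper name, and it uses the default start method rather than the "spawn" context from the question.

```python
import multiprocessing as mp
import time
from datetime import datetime

def fake_simulation(idx):
    """Hypothetical stand-in for run_simulation: sleeps briefly, returns a dict."""
    time.sleep(0.1)
    return {"idx": idx, "data": list(range(1000))}

def run_and_log_arrivals(n_tasks=4, n_workers=2):
    """Run the tasks in a pool and record when each result reaches the parent."""
    arrivals = []
    with mp.Pool(processes=n_workers) as pool:
        # imap_unordered yields each result once the parent has received it,
        # so a long gap after a worker's "Completed" message points at the
        # transfer of the return value rather than at the simulation itself.
        for result in pool.imap_unordered(fake_simulation, range(n_tasks)):
            arrived_at = datetime.now()
            arrivals.append((result["idx"], arrived_at))
            print("Received:", result["idx"], "at", arrived_at.strftime("%H:%M:%S"))
    return arrivals

if __name__ == "__main__":
    run_and_log_arrivals()
```

Comparing the "Received" times in the parent with the "Completed" times printed by the workers would show how long each return value spends in transit.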

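Solution: I changed the simulation to return None

The simulation function returned the simulation results, which are a large dictionary. When I changed it to return None, the problem stopped. I don't need the results returned, because they are saved to a file.

This is no longer a problem for me, but I don't know why this makes a difference. If anyone has any suggestions, I would be interested.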
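
A variation on the same idea is to return a small summary (for example the output path) instead of None, so the parent still knows where each result went. This is a sketch under assumptions: save_and_summarize is a hypothetical helper, fake_result stands in for the real simulation output, and pickle is used here only as a placeholder for the .mat file writing done in the real simulation.

```python
import os
import pickle
import tempfile

def save_and_summarize(result, path):
    """Write the full result to disk and return only a small summary dict."""
    with open(path, "wb") as handle:
        pickle.dump(result, handle)
    # Only this small dictionary would travel back from a worker to the parent.
    return {"path": path, "n_keys": len(result)}

# Hypothetical stand-in for a large simulation result dictionary.
fake_result = {"signal_%d" % i: [0.0] * 1000 for i in range(100)}
out_path = os.path.join(tempfile.mkdtemp(), "example_result.pkl")

summary = save_and_summarize(fake_result, out_path)
print(summary)
```

Inside a pool worker, the wrapper would return summary rather than the full dictionary, and the parent could load any individual file later if it needs the data.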
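
One possible explanation, which the post does not confirm, is the cost of moving the return value between processes: a Pool worker pickles whatever the function returns and sends it to the parent, which then unpickles it. For a very large dictionary that can take far longer than for None. The sketch below measures the pickling step alone; fake_result is a hypothetical stand-in and pickled_size_and_time is a made-up helper name, so the numbers say nothing about the real simulation output.

```python
import pickle
import time

def pickled_size_and_time(obj):
    """Return (size in bytes, seconds taken) for pickling obj once."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return len(payload), time.perf_counter() - start

# Hypothetical stand-in for a large simulation result dictionary.
fake_result = {"step_%d" % i: [float(i)] * 100 for i in range(10000)}

big_size, big_seconds = pickled_size_and_time(fake_result)
none_size, none_seconds = pickled_size_and_time(None)

print("dict:", big_size, "bytes,", round(big_seconds, 4), "s to pickle")
print("None:", none_size, "bytes")
```

Running the same measurement on the real result dictionary inside function_wrapper, just before the return, would show whether serialization is large enough to account for the stall.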
