What is the correct way to use a shared list in multiprocessing?



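I implemented a SharedList in Python (version 3.7) with the help of multiprocessing's Manager and Lock. I use it as an object shared between processes created with multiprocessing's Process. The shared list stores the values/objects generated by each process.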

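SharedList implemented with Python's multiprocessing Manager and Lock

    from multiprocessing import Lock, Manager, Process

    class SharedList(object):
        def __init__(self, limit):
            # Keep a reference to the manager so its server process stays alive.
            self.manager = Manager()
            self.results = self.manager.list([])
            self.lock = Lock()
            self.limit = limit

        def append(self, new_value):
            # Check and append under one lock so the limit cannot be exceeded.
            with self.lock:
                if len(self.results) >= self.limit:
                    return False
                self.results.append(new_value)
                return True

        def list(self):
            with self.lock:
                # list() already returns a copy of the proxy's contents.
                return list(self.results)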

Using the SharedList to store values from multiple processes created with multiprocessing

    results = SharedList(limit)
    num_processes = min(process_count, limit)
    processes = []
    for i in range(num_processes):
        # args must be a tuple, hence the trailing comma
        new_process = Process(target=child_function, args=(results,))
        processes.append(new_process)
        new_process.start()
    for _process in processes:
        _process.join()
    for _process in processes:
        _process.close()

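Implementation of child_function

    def child_function(results):
        # Keep producing values until the shared list reports it is full.
        while True:
            result = func()
            if not results.append(result):
                break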

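This implementation works in some scenarios, but it hangs once I increase the limit. I also tried using fewer processes than the number of CPUs, and the same experiment still hangs at the same place.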

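Is there a better way to solve this problem? I have looked into other approaches, such as using a Queue, but that did not work as expected and hung as well.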
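For comparison, the goal described above (collect exactly `limit` results from a bounded number of worker processes) can be sketched without any shared list by letting `multiprocessing.Pool` gather the return values. This is only a sketch under the assumption that each call to `func()` is independent of the others. The name `collect_results` is hypothetical, and `os.getpid` merely stands in for `func`.

```python
import os
from multiprocessing import Pool


def collect_results(func, limit, process_count):
    """Call func() `limit` times across worker processes and return the results."""
    num_processes = max(1, min(process_count, limit))
    with Pool(processes=num_processes) as pool:
        # Submit one call per wanted result; the pool spreads them over workers.
        pending = [pool.apply_async(func) for _ in range(limit)]
        # get() blocks until the call finishes and re-raises worker errors here.
        return [job.get() for job in pending]


if __name__ == "__main__":
    # os.getpid stands in for func(); it shows which worker served each call.
    print(collect_results(os.getpid, limit=8, process_count=4))
```

Because the parent only ever receives return values, no Lock, sentinel, or limit check is needed in this variant. Note that a `func` passed to the pool has to be picklable, for example a function defined at module level.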

Added my previous implementation that used a Queue

Implementation using Queue

    import multiprocessing
    from time import sleep

    results_out = []
    manager = multiprocessing.Manager()
    results = manager.Queue()
    tasks = manager.Queue()
    num_processes = min(process_count, limit)
    processes = []
    for i in range(num_processes):
        new_process = multiprocessing.Process(target=child_function,
                                              args=(tasks, results))
        processes.append(new_process)
        new_process.start()
    sleep(5)
    # One task token per wanted result
    for i in range(limit):
        tasks.put(0)
    sleep(1)
    # One -1 sentinel per worker, telling it to stop
    for i in range(num_processes):
        tasks.put(-1)
    num_finished_processes = 0
    # Collect results until every worker has echoed its sentinel
    while True:
        new_result = results.get()
        if new_result == -1:
            num_finished_processes += 1
            if num_finished_processes == num_processes:
                break
        else:
            results_out.append(new_result)
    for process in processes:
        process.join()
    for process in processes:
        process.close()

child_function

    def child_function(tasks, results):
        while True:
            task_val = tasks.get()
            if task_val < 0:
                # Sentinel received: echo it to the parent and stop
                results.put(-1)
                break
            else:
                result = func()
                results.put(result)
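One weakness of the collecting loop above is that `results.get()` blocks forever if any worker dies before sending its -1 sentinel. A minimal sketch of a more defensive drain follows. It assumes the same -1 sentinel convention; the names `drain_results`, `poll_seconds` and `_worker` are hypothetical and not part of the original code.

```python
import queue


def drain_results(results, processes, sentinel=-1, poll_seconds=0.5):
    """Collect items until every worker sent `sentinel` or all workers exited."""
    collected = []
    finished = 0
    while finished < len(processes):
        # Check liveness before waiting: if nobody was alive and the wait
        # still times out, nothing more can arrive.
        workers_alive = any(p.is_alive() for p in processes)
        try:
            item = results.get(timeout=poll_seconds)
        except queue.Empty:
            if not workers_alive:
                break
            continue
        if item == sentinel:
            finished += 1
        else:
            collected.append(item)
    return collected


def _worker(results, values):
    # Hypothetical worker: emit some values, then the sentinel.
    for value in values:
        results.put(value)
    results.put(-1)


if __name__ == "__main__":
    from multiprocessing import Manager, Process

    manager = Manager()
    results = manager.Queue()
    workers = [Process(target=_worker, args=(results, [i, i + 10]))
               for i in range(3)]
    for worker in workers:
        worker.start()
    print(sorted(drain_results(results, workers)))
    for worker in workers:
        worker.join()
```

In the code above, a call such as `results_out = drain_results(results, processes)` would take the place of the `while True` loop. Whether a crashed worker is actually the cause of the hang is not established here; the sketch only keeps the parent from waiting indefinitely.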

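Update

Before posting this question I had read the references below, but I could not get the desired output. I agree that this code ends up in a deadlock, but I have not been able to find a deadlock-free implementation using multiprocessing in Python.

References

  1. Multiprocessing with a shared list

  2. https://pymotw.com/2/multiprocessing/basics.html

  3. Shared variable in Python's multiprocessing

  4. https://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing

  5. https://medium.com/@urban_ainstitute/using-multiprocessing-to-make-python-code-foster-23ea5ef996ba

  6. http://kmdouglass.github.io/posts/learning-pythons-multiprocessing-module/

  7. Python multiprocessing/threading cleanup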

Following the suggestions, I modified SharedList to use a Queue

    from multiprocessing import Manager, Process
    from time import sleep

    class SharedList(object):
        def __init__(self, limit):
            self.manager = Manager()
            self.tasks = self.manager.Queue()
            self.results = self.manager.Queue()
            self.limit = limit
            # process_count is defined elsewhere in my program
            self.no_of_process = min(process_count, limit)

        def setup(self):
            sleep(1)
            # One task token per wanted result
            for i in range(self.limit):
                self.tasks.put(0)
            sleep(1)
            # One -1 sentinel per worker, telling it to stop
            for i in range(self.no_of_process):
                self.tasks.put(-1)

        def append(self, new_value):
            task_val = self.tasks.get()
            if task_val < 0:
                self.results.put(-1)
                return False
            else:
                self.results.put(new_value)
                return True

        def list(self):
            results_out = []
            num_finished_processes = 0
            # Drain results until every worker has echoed its sentinel
            while True:
                new_result = self.results.get()
                if new_result == -1:
                    num_finished_processes += 1
                    if num_finished_processes == self.no_of_process:
                        break
                else:
                    results_out.append(new_result)
            return results_out

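This implementation works well with the following changes to the calling code

    results = SharedList(limit)
    num_processes = min(process_count, limit)
    processes = []
    for i in range(num_processes):
        # args must be a tuple, hence the trailing comma
        new_process = Process(target=child_function, args=(results,))
        processes.append(new_process)
        new_process.start()
    # Fill the task queue only after the workers have been started
    results.setup()
    for _process in processes:
        _process.join()
    for _process in processes:
        _process.close()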

Implementation of child_function

    def child_function(results):
        while True:
            result = func()
            if not results.append(result):
                break

However, after some iterations, this deadlocks again.
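To narrow down where such a hang occurs, one option is to join with a timeout and report which processes are still alive, instead of calling `join()` with no limit. The following is a minimal diagnostic sketch, not a fix. The name `join_or_report` is hypothetical, and `time.sleep` stands in for `child_function`.

```python
import time
from multiprocessing import Process


def join_or_report(processes, timeout_seconds):
    """Join every process within one shared deadline; terminate and name the rest."""
    deadline = time.monotonic() + timeout_seconds
    stuck = []
    for process in processes:
        remaining = max(0.0, deadline - time.monotonic())
        process.join(remaining)
        if process.is_alive():
            stuck.append(process)
    for process in stuck:
        # Last resort: the worker gets no chance to clean up.
        process.terminate()
        process.join()
    return [process.name for process in stuck]


if __name__ == "__main__":
    # One worker finishes quickly, the other simulates a hang.
    workers = [
        Process(target=time.sleep, args=(0.1,), name="fast-worker"),
        Process(target=time.sleep, args=(60,), name="stuck-worker"),
    ]
    for worker in workers:
        worker.start()
    print(join_or_report(workers, timeout_seconds=2.0))
```

Knowing which workers never returned at least shows whether the hang sits inside `func()` or in the queue handshake.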

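I also found the article below about Ray, which sounds interesting: it appears to make parallel computation both easy to implement and efficient.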

https://towardsdatascience.com/modern-parallel-and-distributed-python-a-quick-tutorial-on-ray-99f8d70369b8
