I wrote this simple program:
import multiprocessing
import time
from multiprocessing import Pool

d = {"what": '1'}

def square(x):
    print("Adding process:", x)
    d[x] = x
    print("Inner d", d)

if __name__ == "__main__":
    pool = Pool()
    pool.map(square, range(0, 5))
    pool.close()
    print(d)
Output:
('Adding process:', 0)
('Inner d', {0: 0, 'what': '1'})
('Adding process:', 1)
('Inner d', {0: 0, 1: 1, 'what': '1'})
('Adding process:', 2)
('Inner d', {0: 0, 1: 1, 'what': '1', 2: 2})
('Adding process:', 3)
('Inner d', {0: 0, 1: 1, 'what': '1', 3: 3, 2: 2})
('Adding process:', 4)
('Inner d', {0: 0, 1: 1, 'what': '1', 3: 3, 4: 4, 2: 2})
{'what': '1'}
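I am new to multiprocessing, so I would like to know: how can I get the value of d that the subprocesses updated back in the main process?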
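In your program each process in the pool works on its own copy of d, so the updates made by the workers never reach the dictionary in the main process. The easiest way to share a dictionary across multiple processes is to use a managed dictionary created by the SyncManager instance that is returned when you call multiprocessing.Manager(). This is a fairly complicated subject. When you create such a dictionary, what you get back is a reference to a proxy for the dictionary managed by the SyncManager. So when you call a method on the dictionary, you are essentially executing something more akin to a remote procedure call, which can be slower than operating on a standard local dictionary.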
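To make the proxy behaviour a little more concrete, here is a minimal sketch. The helper name proxy_demo is hypothetical and only exists for illustration. It shows that the object returned by manager.dict() is a DictProxy rather than a plain dict, that a single update() call can send several items at once, and that copy() hands back an ordinary local dictionary:

```python
from multiprocessing import Manager


def proxy_demo():
    """Return the proxy's type name, an isinstance check and a local snapshot.

    proxy_demo is a hypothetical helper used only for this illustration.
    """
    with Manager() as manager:
        d = manager.dict()
        # Each operation on d is forwarded to the manager's server process.
        d['what'] = '1'
        # One update() call sends several items in a single request.
        d.update({0: 0, 1: 1})
        # copy() returns an ordinary local dict with the current contents.
        snapshot = d.copy()
        return type(d).__name__, isinstance(d, dict), snapshot


if __name__ == "__main__":
    print(proxy_demo())
```

If a worker needs to read many keys, taking one local snapshot like this and reading from it may be cheaper than going through the proxy for every lookup, although the snapshot will not see later updates.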
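In the code below, a pool initializer is used to initialize a global variable d in each process of the pool with a reference to the dictionary proxy: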
from multiprocessing import Pool, Manager
import time

def init_pool(the_dict):
    global d
    # initialize the global for each process in the pool:
    d = the_dict

def square(x):
    print("Adding process:", x, flush=True)
    d[x] = x
    print("Inner d", d, flush=True)

if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()
        d['what'] = '1'
        pool = Pool(initializer=init_pool, initargs=(d,))
        pool.map(square, range(0, 5))
        pool.close() # not necessary
        print(d)
Prints:
Adding process: 0
Inner d {'what': '1', 0: 0}
Adding process: 1
Inner d {'what': '1', 0: 0, 1: 1}
Adding process: 2
Inner d {'what': '1', 0: 0, 1: 1, 2: 2}
Adding process: 3
Inner d {'what': '1', 0: 0, 1: 1, 2: 2, 3: 3}
Adding process: 4
Inner d {'what': '1', 0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
{'what': '1', 0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
Update
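Python 2 does not support flush=True on the print statement, which makes it harder to keep output printed in parallel by different processes from being interleaved. Here we use a lock to ensure that printing is done serially: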
from multiprocessing import Pool, Manager, Lock
import time
import sys

def init_pool(the_dict, the_lock):
    global d
    global lock
    # initialize the globals for each process in the pool:
    d = the_dict
    lock = the_lock

def square(x):
    d[x] = x
    # hold the lock so that only one process prints at a time:
    with lock:
        print "Adding process:", x, "\nInner d", d
        sys.stdout.flush()

if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()
        d['what'] = '1'
        lock = Lock()
        pool = Pool(initializer=init_pool, initargs=(d, lock))
        pool.map(square, range(0, 5))
        pool.close() # not necessary
        print d
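For comparison, the same lock-based approach can also be written in Python 3 syntax. This is only a sketch under a few assumptions: the helper run and the local name shared are hypothetical, and the function returns a plain-dict snapshot so the result can still be inspected after the manager has shut down:

```python
from multiprocessing import Pool, Manager, Lock


def init_pool(the_dict, the_lock):
    global d, lock
    # initialize the globals for each process in the pool:
    d = the_dict
    lock = the_lock


def square(x):
    d[x] = x
    # hold the lock so that only one process prints at a time:
    with lock:
        print("Adding process:", x, "\nInner d", d, flush=True)


def run(n):
    """Hypothetical helper: fill the managed dict from a pool, return a copy."""
    with Manager() as manager:
        shared = manager.dict()
        shared['what'] = '1'
        with Pool(initializer=init_pool, initargs=(shared, Lock())) as pool:
            pool.map(square, range(n))
        # copy() gives a regular dict that outlives the manager
        return shared.copy()


if __name__ == "__main__":
    print(run(5))
```

The order of the "Adding process" lines can vary from run to run, but the lock keeps each pair of lines together.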