I am programming with PyTorch multiprocessing. I want all subprocesses to be able to read/write the same list of tensors (without resizing them). For example, the variable could be

m = [torch.randn(3), torch.randn(5)]

Because each tensor has a different size, I cannot pack them into a single tensor. Python lists have no share_memory_() method, and multiprocessing.Manager cannot handle a list of tensors. How can I share the variable m among multiple subprocesses?
I found the solution myself, and it is quite simple: just call share_memory_() on each list element. The list itself is not in shared memory, but the list elements are.

Demo code:
import torch.multiprocessing as mp
import torch

def foo(worker, tl):
    # In-place update: visible to the parent because the tensor's storage is shared.
    tl[worker] += (worker + 1) * 1000

if __name__ == '__main__':
    tl = [torch.randn(2), torch.randn(3)]

    # Move each tensor's storage into shared memory (the list itself is not shared).
    for t in tl:
        t.share_memory_()

    print("before mp: tl=")
    print(tl)

    p0 = mp.Process(target=foo, args=(0, tl))
    p1 = mp.Process(target=foo, args=(1, tl))
    p0.start()
    p1.start()
    p0.join()
    p1.join()

    print("after mp: tl=")
    print(tl)
Output:
before mp: tl=
[
1.5999
2.2733
[torch.FloatTensor of size 2]
,
0.0586
0.6377
-0.9631
[torch.FloatTensor of size 3]
]
after mp: tl=
[
1001.5999
1002.2733
[torch.FloatTensor of size 2]
,
2000.0586
2000.6377
1999.0370
[torch.FloatTensor of size 3]
]
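As a quick sanity check (not part of the original answer), torch.Tensor.is_shared() reports whether a tensor's storage lives in shared memory, so you can verify the setup before starting the workers. A minimal sketch; the assert_all_shared helper is hypothetical:

import torch

def assert_all_shared(tensors):
    # Hypothetical helper: raise if any tensor in the list is not backed by shared memory.
    for i, t in enumerate(tensors):
        if not t.is_shared():
            raise RuntimeError(f"tensor {i} is not in shared memory")

tl = [torch.randn(2), torch.randn(3)]
for t in tl:
    t.share_memory_()
assert_all_shared(tl)  # passes once share_memory_() has been called on every element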
The original answer given by @rozyang does not work for GPU tensors. It raises an error like:

RuntimeError: CUDA error: initialization error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

To fix this, add mp.set_start_method('spawn', force=True) to the code. Here is the snippet:
import torch.multiprocessing as mp
import torch

def foo(worker, tl):
    tl[worker] += (worker + 1) * 1000

if __name__ == '__main__':
    # CUDA does not survive fork(); spawn fresh interpreter processes instead.
    mp.set_start_method('spawn', force=True)

    tl = [torch.randn(2, device='cuda:0'), torch.randn(3, device='cuda:0')]

    for t in tl:
        t.share_memory_()

    print("before mp: tl=")
    print(tl)

    p0 = mp.Process(target=foo, args=(0, tl))
    p1 = mp.Process(target=foo, args=(1, tl))
    p0.start()
    p1.start()
    p0.join()
    p1.join()

    print("after mp: tl=")
    print(tl)
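If you would rather not change the global start method (for example inside a library), torch.multiprocessing also re-exports multiprocessing.get_context, so the spawn behaviour can be scoped to just these processes. This is a minimal sketch under that assumption, reusing the same foo worker as above, not the answer's original code:

import torch.multiprocessing as mp
import torch

def foo(worker, tl):
    tl[worker] += (worker + 1) * 1000

if __name__ == '__main__':
    # A spawn context applies only to processes created from it,
    # leaving the interpreter's default start method untouched.
    ctx = mp.get_context('spawn')

    tl = [torch.randn(2, device='cuda:0'), torch.randn(3, device='cuda:0')]
    for t in tl:
        t.share_memory_()

    procs = [ctx.Process(target=foo, args=(i, tl)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(tl)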