Non-deterministic behavior when training a neural network on GPU in PyTorch with a fixed random seed

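I observe strange behavior in the final accuracy when I run exactly the same experiment (the same code for training a neural network for image classification) with the same random seed on different GPUs (machines). I use only one GPU. To be precise, when I run the experiment on machine_1, the accuracy is 86.37. When I run it on machine_2, the accuracy is 88.0. When I run the experiment multiple times on the same machine, there is no variation. The PyTorch and CUDA versions are the same. Could you help me find the cause and fix it?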

Machine_1: NVIDIA-SMI 440.82, Driver Version: 440.82, CUDA Version: 10.2

Machine_2: NVIDIA-SMI 440.100, Driver Version: 440.100, CUDA Version: 10.2

To fix the random seed, I use the following code:

random.seed(args.seed)
os.environ['PYTHONHASHSEED'] = str(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
torch.cuda.manual_seed(args.seed)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

This is what I use:

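import os
import random

import numpy as np
import torch


def set_seed(seed):
    # Seed the PyTorch CPU generator and all CUDA generators.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Ask cuDNN for deterministic kernels and disable its autotuner.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Seed NumPy and Python's built-in generator.
    np.random.seed(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)


set_seed(13)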

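Make sure you have a single function that sets all the seeds in one go. If you are using Jupyter notebooks, the order and timing of cell execution may cause this: re-running a cell without re-seeding continues from the current state of the random number generators. The order of the calls inside the function may also matter. I have never had problems with this code. You can call set_seed() as often as you need in your code.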
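The point about notebook cell execution can be checked with a small experiment. The sketch below is only an illustration and not part of the original answer: it repeats set_seed() so that the block runs on its own, and draw() is a hypothetical helper introduced here. It runs on the CPU, so it only shows that re-seeding restores the generator state; it says nothing about results across different GPUs or driver versions.

```python
import os
import random

import numpy as np
import torch


def set_seed(seed):
    # Same calls as the function above, repeated so this block is self-contained.
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(seed)
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)


def draw():
    # Hypothetical helper: one draw from each generator that set_seed() controls.
    return random.random(), float(np.random.rand()), torch.rand(3).tolist()


set_seed(13)
first = draw()
second = draw()   # generator state has advanced, like re-running a notebook cell

set_seed(13)
third = draw()    # re-seeding restores the starting state

print("first == third: ", first == third)
print("first == second:", first == second)
```

If the first comparison holds and the second does not, the seeds are being applied correctly and any remaining difference between runs comes from where in the program set_seed() is called, or from sources the seed does not control.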
