Training YOLOv5 on an RTX 3060 Ti GPU, I get the error "RuntimeError: Unable to find a valid cuDNN algorithm to run convolution"

I am training YOLOv5 on an RTX 3060 Ti GPU with --img 1088 and a batch size of 16, using the following command:

python train.py --img 1088 --batch 16 --epochs 3 --data coco128.yaml

I get the following exception: "RuntimeError: Unable to find a valid cuDNN algorithm to run convolution". By reducing the batch size to 8, I am able to train the model.

File "train.py", line 611, in <module>
main(opt)
File "train.py", line 509, in main
train(opt.hyp, opt, device)
File "train.py", line 311, in train
pred = model(imgs)  # forward
File "C:Program FilesPython38libsite-packagestorchnnmodulesmodule.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:Usershamza.mworkspaceyolov5modelsyolo.py", line 123, in forward
return self.forward_once(x, profile, visualize)  # single-scale inference, train
File "C:Usershamza.mworkspaceyolov5modelsyolo.py", line 155, in forward_once
x = m(x)  # run
File "C:Program FilesPython38libsite-packagestorchnnmodulesmodule.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:Usershamza.mworkspaceyolov5modelscommon.py", line 137, in forward
return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
File "C:Program FilesPython38libsite-packagestorchnnmodulesmodule.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:Usershamza.mworkspaceyolov5modelscommon.py", line 45, in forward
return self.act(self.bn(self.conv(x)))
File "C:Program FilesPython38libsite-packagestorchnnmodulesmodule.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:Program FilesPython38libsite-packagestorchnnmodulesconv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "C:Program FilesPython38libsite-packagestorchnnmodulesconv.py", line 419, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

P.S. Can anyone also guide me on how to evaluate which GPU is best suited for training my model? Please enlighten me on that as well.

The answer is in the error log:

RuntimeError: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 8.00 GiB total capacity; 5.48 GiB already allocated; 81.94 MiB free; 5.61 GiB reserved in total by PyTorch)

It is trying to allocate more memory than your GPU has free: of the 8.00 GiB total, PyTorch has already reserved 5.61 GiB, leaving only 81.94 MiB free, which is less than the 100.00 MiB the convolution needs.
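If you want to check these numbers yourself, here is a minimal sketch using PyTorch's standard CUDA memory queries (it assumes CUDA is available and that GPU 0 is your training device):

import torch

# Print the same figures that appear in the out-of-memory message.
device = torch.device("cuda:0")
props = torch.cuda.get_device_properties(device)
print(f"GPU: {props.name}")
print(f"Total capacity: {props.total_memory / 1024**3:.2f} GiB")
print(f"Allocated by tensors: {torch.cuda.memory_allocated(device) / 1024**3:.2f} GiB")
print(f"Reserved by PyTorch: {torch.cuda.memory_reserved(device) / 1024**3:.2f} GiB")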

Try reducing the batch size. I had the same problem, and when I reduced the batch size it worked for me! (See the example command below.)
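Based on the command in the question, halving the batch size would look like this (a sketch that keeps your other flags unchanged; you mentioned that batch size 8 already trains successfully):

python train.py --img 1088 --batch 8 --epochs 3 --data coco128.yaml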
