RuntimeError: CUDA error: device-side assert triggered, on the second call to the model



I get the following error when using a PyTorch model:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
2197         # remove once script supports set_grad_enabled
2198         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2199     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
2200 
2201 
RuntimeError: CUDA error: device-side assert triggered

The error only seems to occur the second time I call the model. My code:

epochs = 500
losses = []
model.to(device)
for e in range(epochs):
    running_loss = 0
    current_batch = 1
    for x1, x2, y in data_loader:
        print("x1 to device")
        x3 = x1.to(device)
        print("--- Computing embedding1 ---")
        embedding1 = model(x3, pooling_method=pooling_method)
        print(embedding1.size())
        print("x2 to device")
        x4 = x2.to(device)
        print("--- Computing embedding2 ---")
        embedding2 = model(x4, pooling_method=pooling_method)
        print(embedding2.size())

Output:

x1 to device
--- Computing embedding1 ---
torch.Size([64, 768])
x2 to device
--- Computing embedding2 ---
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-6b36cff704b2> in <module>
21     x4 = x2.to(device)
22     print("--- Computing embedding2 ---")
---> 23     embedding2 = model(x4, pooling_method=pooling_method)
24     print(embedding2.size())
25 
8 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
2197         # remove once script supports set_grad_enabled
2198         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2199     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
2200 
2201 
RuntimeError: CUDA error: device-side assert triggered

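Both inputs have the same shape, so the shape is not the problem. The error seems to occur when the model computes its output, but only on the second call.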

The device is:

device(type='cuda', index=0)

In case it is needed, the model is:

import torch.nn as nn
from transformers import CamembertModel

class BERT(nn.Module):
    """
    Torch model based on CamemBERT, in order to make sentence embeddings
    """
    def __init__(self, tokenizer, model_name=model_name, output_size=100):
        super().__init__()
        self.bert = CamembertModel.from_pretrained(model_name)
        # Resize the embedding matrix to match the tokenizer's vocabulary
        self.bert.resize_token_embeddings(len(tokenizer))

    def forward(self, x, pooling_method='cls'):
        hidden_states = self.bert(x).last_hidden_state
        embedding = pooling(hidden_states, pooling_method=pooling_method)
        return embedding

Does anyone know how to fix this?

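This CUDA error usually has one of two causes:

  1. A mismatch between the number of labels/classes and the number of output units. In your case, this could be the input/output size of the embedding.
  2. An incorrect input to the loss function. I am not sure which loss you are using, or whether you changed it from the BERT default.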
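
Since the traceback ends in torch.embedding, the first cause can be checked directly by looking for token ids that fall outside the embedding table. The following is only a minimal sketch: find_out_of_range_ids is a hypothetical helper, and the small nn.Embedding and the x1/x2 tensors are made-up stand-ins for the real model and batches.

```python
import torch
import torch.nn as nn


def find_out_of_range_ids(input_ids: torch.Tensor, num_embeddings: int) -> torch.Tensor:
    """Return the unique ids in input_ids that fall outside [0, num_embeddings).

    Hypothetical helper for illustration; it is not part of PyTorch.
    """
    ids = input_ids.detach().cpu()
    bad = (ids < 0) | (ids >= num_embeddings)
    return torch.unique(ids[bad])


# Toy stand-ins (assumptions): a 10-row embedding table and two batches.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)
x1 = torch.tensor([[1, 2, 3], [4, 5, 6]])    # every id is below 10
x2 = torch.tensor([[1, 2, 10], [4, 12, 6]])  # 10 and 12 are out of range

bad_x1 = find_out_of_range_ids(x1, embedding.num_embeddings)
bad_x2 = find_out_of_range_ids(x2, embedding.num_embeddings)
print(bad_x1.tolist(), bad_x2.tolist())
```

With the model from the question, the same check could be run on x1 and x2 before calling the model, comparing against model.bert.get_input_embeddings().num_embeddings. Two batches can have identical shapes and still differ in whether their ids are in range.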

See the solution here: https://builtin.com/software-engineering-perspectives/cuda-error-device-side-assert-triggered
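
A device-side assert hides the actual reason for the failure, so a common debugging step is to repeat the failing call on the CPU, where an out-of-range embedding index raises an ordinary Python exception with a readable message. The sketch below assumes a toy nn.Embedding in place of the real model, and cpu_lookup_error is a hypothetical helper name.

```python
import copy

import torch
import torch.nn as nn


def cpu_lookup_error(embedding: nn.Embedding, input_ids: torch.Tensor):
    """Run the lookup on a CPU copy of the layer.

    Returns the error message if the lookup fails, otherwise None.
    Hypothetical helper for illustration only.
    """
    cpu_embedding = copy.deepcopy(embedding).cpu()  # leave the original layer untouched
    try:
        cpu_embedding(input_ids.cpu())
    except (IndexError, RuntimeError) as exc:
        return str(exc)
    return None


# Toy stand-in (assumption): a 10-row embedding table.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)
ok_message = cpu_lookup_error(embedding, torch.tensor([[1, 2, 9]]))
bad_message = cpu_lookup_error(embedding, torch.tensor([[1, 2, 10]]))
print(ok_message)
print(bad_message)
```

When the code has to stay on the GPU, setting the environment variable CUDA_LAUNCH_BLOCKING=1 before starting Python makes kernel launches synchronous, so the traceback points at the call that actually failed. After a device-side assert the CUDA context is generally left unusable, so the runtime usually has to be restarted before trying again.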
