PyTorch nll_loss returns a constant loss in the training loop



I have a binary image classification problem: I want to classify whether an image shows an ant or a bee. I scraped the images and did all the cleaning, reshaping, and conversion to grayscale. The images are 200x200, single-channel grayscale. Before jumping to conv nets, I first wanted to solve the problem with a feed-forward NN.
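For reference, the preprocessing I mean is roughly the following (a minimal sketch with a hypothetical image path, assuming PIL and torchvision are available; not my exact scraping/cleaning code):

import torch
from PIL import Image
from torchvision import transforms

# convert one scraped image to a 200x200 single-channel grayscale tensor
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((200, 200)),
    transforms.ToTensor(),                   # float tensor in [0, 1], shape (1, 200, 200)
])

img = Image.open("data/ants/example.jpg")    # hypothetical path
x = preprocess(img).squeeze(0)               # shape (200, 200)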

My problem is that during the training loop I get a constant loss. I am using the Adam optimizer, F.log_softmax as the last layer of the network, and the nll_loss function. My code so far looks like this:

The FF network

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, 2)

    def forward(self, X):
        X = F.relu(self.fc1(X))
        X = F.relu(self.fc2(X))
        X = F.relu(self.fc3(X))
        X = F.log_softmax(self.fc4(X), dim=1)
        return X

net = Net()

Training loop

optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
EPOCHS = 10
BATCH_SIZE = 5
for epoch in range(EPOCHS):
    print(f'Epochs: {epoch+1}/{EPOCHS}')
    for i in range(0, len(y_train), BATCH_SIZE):
        X_batch = X_train[i: i+BATCH_SIZE].view(-1, 200 * 200)
        y_batch = y_train[i: i+BATCH_SIZE].type(torch.LongTensor)

        net.zero_grad()  ## or you can say optimizer.zero_grad()

        outputs = net(X_batch)
        loss = F.nll_loss(outputs, y_batch)
        loss.backward()
        optimizer.step()
    print("Loss", loss)

I suspect the problem may be with my batching or the loss function. I would appreciate any help. Note: the images are grayscale images of shape (200, 200).
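For reference, a quick check of the first batch (shape, dtype, label values, pixel range) would look something like this; X_train, y_train, and BATCH_SIZE are the same names used above:

X_batch = X_train[0: BATCH_SIZE].view(-1, 200 * 200)
y_batch = y_train[0: BATCH_SIZE].type(torch.LongTensor)
print(X_batch.shape, X_batch.dtype)     # expect torch.Size([5, 40000]) and a float dtype
print(X_batch.min(), X_batch.max())     # pixel range, e.g. roughly [0, 1] if normalized
print(y_batch.shape, y_batch.unique())  # expect torch.Size([5]) and labels {0, 1}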

I waited for an answer but did not even get a comment. I found the solution myself; maybe it will help someone in the future.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(200 * 200, 64)  # in_features is 200 * 200, i.e. a flattened 200x200 grayscale image
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, 2)

    def forward(self, X):
        X = F.relu(self.fc1(X))
        X = F.relu(self.fc2(X))
        X = F.relu(self.fc3(X))
        X = self.fc4(X)  # no activation here: the network now returns raw logits
        return X

net = Net()
# changed the loss function to CrossEntropyLoss(), since no activation is applied on the last layer
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
EPOCHS = 10
BATCH_SIZE = 5
for epoch in range(EPOCHS):
    print(f'Epochs: {epoch+1}/{EPOCHS}')
    for i in range(0, len(y_train), BATCH_SIZE):
        X_batch = X_train[i: i+BATCH_SIZE].view(-1, 200 * 200)
        y_batch = y_train[i: i+BATCH_SIZE].type(torch.LongTensor)

        net.zero_grad()  ## or you can say optimizer.zero_grad()

        outputs = net(X_batch)
        loss = loss_function(outputs, y_batch)
        loss.backward()
        optimizer.step()
    print("Loss", loss)
