I am training a CNN. When I build the loss function with CrossEntropyLoss and train on the dataset, an error tells me the batch sizes do not match. Here is the main training code:
net = SimpleConvolutionalNetwork()
train_history, val_history = train(net, batch_size=32, n_epochs=10, learning_rate=0.001)
plot_losses(train_history, val_history)
Here is the network code:
class SimpleConvolutionalNetwork(nn.Module):
    # Q: why is the spatial size of the input unchanged after relu??
    def __init__(self) -> None:
        super(SimpleConvolutionalNetwork, self).__init__()
        # convolutional layer: 3 input channels -> 18 output channels;
        # kernel 3, stride 1, padding 1 preserves the 32*32 spatial size
        self.conv1 = nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
        # max-pooling layer that halves the spatial size
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        # fully connected hidden layer and output layer
        self.fc1 = nn.Linear(18 * 16 * 16, 64)
        self.fc2 = nn.Linear(64, 10)

    # Q: where is the pooling layer??
    def forward(self, x):
        # input shape: (batch_size, 3, 32, 32)
        # filter with conv1 defined in the constructor, then relu the result
        x = F.relu(self.conv1(x))
        # intended reshape: 18*32*32 -> 18*16*16
        x = x.view(-1, 18 * 16 * 16)
        # 18*16*16 (4608 in total) -> 64: apply fc1, then relu again
        x = F.relu(self.fc1(x))
        # 64 -> 10 finally
        x = self.fc2(x)
        return x
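For reference, feeding a dummy batch through the layers makes the shape problem visible (a minimal check, assuming the class above and the usual torch imports; the variable names here are illustrative):

import torch as th
import torch.nn.functional as F

net = SimpleConvolutionalNetwork()
dummy = th.randn(32, 3, 32, 32)      # one batch of 32 RGB 32x32 images
h = F.relu(net.conv1(dummy))
print(h.shape)                       # torch.Size([32, 18, 32, 32])
flat = h.view(-1, 18 * 16 * 16)
print(flat.shape)                    # torch.Size([128, 4608]) -- batch dim grew 4x
print(net(dummy).shape)              # torch.Size([128, 10]), while the labels have shape [32]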
In the train function, the error points at the line where the loss is computed. Since the full function is long, only the main part is shown below:
def train(net, batch_size, n_epochs, learning_rate):
    ...
    # load the training dataset
    train_loader = get_train_loader(batch_size)
    # load the validation dataset
    val_loader = get_val_loader(batch_size)
    # number of mini-batches per epoch
    n_minibatches = len(train_loader)
    # build the loss function and the optimizer
    criterion, optimizer = createLossAndOptimizer(net, learning_rate)
    train_history = []
    val_history = []
    training_start_time = time.time()
    best_error = np.inf
    best_model_path = "best_model_path"
    # move the model to the GPU if possible
    net = net.to(device)
    for epoch in range(n_epochs):
        running_loss = 0.0
        print_every = n_minibatches
        start_time = time.time()
        total_train_loss = 0.0
        # step 1: train on the training set
        for i, (inputs, labels) in enumerate(train_loader):
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            # forward + backward + optimize
            outputs = net(inputs)
            loss = criterion(outputs, labels)  # <- the ValueError is raised here
            loss.backward()
            optimizer.step()
            # accumulate statistics
            running_loss += loss.item()
            total_train_loss += loss.item()
            # periodically print running statistics
            if (i + 1) % (print_every + 1) == 0:
                print("Epoch {}, {:d}% \t train_loss: {:.2f} took: {:.2f}s".format(
                    epoch + 1, int(100 * (i + 1) / n_minibatches), running_loss / print_every,
                    time.time() - start_time))
                running_loss = 0.0
                start_time = time.time()
        train_history.append(total_train_loss / len(train_loader))
    ...
The loss construction function and the dataset loading are as follows:
def createLossAndOptimizer(net, learning_rate=0.001):
    # define a cross-entropy loss function
    criterion = nn.CrossEntropyLoss()
    # Adam optimizer over the network's parameters with the given learning rate
    optimizer = opt.Adam(net.parameters(), lr=learning_rate)
    return criterion, optimizer
def get_train_loader(batch_size):
    return th.utils.data.DataLoader(train_set, batch_size=batch_size,
                                    sampler=train_sampler, num_workers=num_workers)

def get_val_loader(batch_size):
    # NOTE: this reuses train_set and train_sampler, so the "validation" loader
    # currently yields training batches
    return th.utils.data.DataLoader(train_set, batch_size=batch_size,
                                    sampler=train_sampler, num_workers=num_workers)
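For context, nn.CrossEntropyLoss expects logits of shape (batch_size, num_classes) and integer class targets of shape (batch_size,), and it raises exactly this kind of error when the two batch dimensions disagree. A minimal sketch (the tensor names are illustrative):

import torch as th
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = th.randn(32, 10)           # (batch_size, num_classes)
targets = th.randint(0, 10, (32,))  # (batch_size,) integer class labels
loss = criterion(logits, targets)   # fine: batch sizes match

bad_logits = th.randn(128, 10)      # what a too-aggressive view(-1, ...) produces
# criterion(bad_logits, targets)    # ValueError: Expected input batch_size (128)
#                                   # to match target batch_size (32)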
However, the error tells me that the input batch size is larger than the target batch size:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-19-07b692e7a2bb> in <module>()
173 net = SimpleConvolutionalNetwork()
174
--> 175 train_history, val_history = train(net, batch_size=32, n_epochs=10, learning_rate=0.001)
176
177 plot_losses(train_history, val_history)
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
2844 if size_average is not None or reduce is not None:
2845 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
2847
2848
ValueError: Expected input batch_size (128) to match target batch_size (32).
I mostly suspect I set some parameter incorrectly, since the input batch is 4 times the size of the "labels" batch (128 = 4 × 32), but I do not know how to fix it. Thanks for any answer.
In the forward method of SimpleConvolutionalNetwork, after applying conv1 the tensor x has shape (batch_size, 18, 32, 32). So when x = x.view(-1, 18 * 16 * 16) is executed, the shape of x becomes (batch_size * 4, 18 * 16 * 16), and since the fully connected layers applied afterwards do not change this new batch dimension, the output has shape (batch_size * 4, 10). My suggestion is to apply the pooling layer right after the convolution, like this:

x = F.relu(self.conv1(x))  # after that x will have shape (batch_size, 18, 32, 32)
x = self.pool(x)           # after that x will have shape (batch_size, 18, 16, 16)

This way forward returns a tensor of shape (batch_size, 10), and the batch size mismatch error no longer occurs.
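Putting it together, a minimal sketch of the corrected forward method: with pooling applied first, the view flattens (batch_size, 18, 16, 16) into (batch_size, 4608), so the batch dimension is preserved.

    def forward(self, x):
        # input shape: (batch_size, 3, 32, 32)
        x = F.relu(self.conv1(x))     # -> (batch_size, 18, 32, 32)
        x = self.pool(x)              # -> (batch_size, 18, 16, 16)
        x = x.view(-1, 18 * 16 * 16)  # -> (batch_size, 4608)
        x = F.relu(self.fc1(x))       # -> (batch_size, 64)
        x = self.fc2(x)               # -> (batch_size, 10)
        return x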