TensorFlow 2 - Model subclassing ValueError



I am trying to create the LeNet-300-100 dense neural network using TensorFlow 2's model subclassing API. My code is as follows:

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Flatten, Dense

batch_size = 32
num_epochs = 20
num_classes = 10

# Load MNIST dataset-
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
# Convert class vectors/targets to binary class matrices (one-hot encoded values)-
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test = tf.keras.utils.to_categorical(y_test, num_classes)
X_train.shape, y_train.shape
# ((60000, 28, 28), (60000, 10))
X_test.shape, y_test.shape
# ((10000, 28, 28), (10000, 10))


class LeNet300(Model):
    def __init__(self, **kwargs):
        super(LeNet300, self).__init__(**kwargs)

        self.flatten = Flatten()
        self.dense1 = Dense(units = 300, activation = 'relu')
        self.dense2 = Dense(units = 100, activation = 'relu')
        self.op = Dense(units = 10, activation = 'softmax')

    def call(self, inputs):
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.op(x)


# Instantiate an object using LeNet-300-100 dense model-
model = LeNet300()
# Compile the defined model-
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy']
)

# Define early stopping callback-
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor = 'val_loss', min_delta = 0.001,
    patience = 3)
# Train defined and compiled model-
history = model.fit(
    x = X_train, y = y_train,
    batch_size = batch_size, shuffle = True,
    epochs = num_epochs,
    callbacks = [early_stopping_callback],
    validation_data = (X_test, y_test)
)

When calling model.fit(), it gives the following error:

ValueError: Shape mismatch: The shape of labels (received (320,)) should equal the shape of logits except for the last dimension (received (32, 10)).

What is going wrong?

Thanks!

The SparseCategoricalCrossentropy loss does not expect one-hot encoded labels. In the documentation they mention:

Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers. If you want to provide labels using one-hot representation, please use CategoricalCrossentropy loss. There should be # classes floating point values per feature for y_pred and a single floating point value per feature for y_true.
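For illustration, here is a minimal sketch (using dummy data with 10 classes and a batch of 32, matching the question) of the label shapes each loss expects:

import numpy as np
import tensorflow as tf

y_pred = tf.nn.softmax(tf.random.normal((32, 10)))   # dummy model output, shape (32, 10)

y_int = np.random.randint(0, 10, size=(32,))          # integer labels, shape (32,)
y_onehot = tf.keras.utils.to_categorical(y_int, 10)   # one-hot labels, shape (32, 10)

# SparseCategoricalCrossentropy expects the integer labels ...
print(tf.keras.losses.SparseCategoricalCrossentropy()(y_int, y_pred))
# ... while CategoricalCrossentropy expects the one-hot labels.
print(tf.keras.losses.CategoricalCrossentropy()(y_onehot, y_pred))

# Passing y_onehot to the sparse loss instead raises a shape mismatch
# like the "(320,) vs (32, 10)" error in the question.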

Hence you get the error. If you look at the stack trace, the error occurs inside the loss function:

/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/keras/losses.py:1569 sparse_categorical_crossentropy
y_true, y_pred, from_logits=from_logits, axis=axis)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:4941 sparse_categorical_crossentropy
labels=target, logits=output)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py:4241 sparse_softmax_cross_entropy_with_logits_v2
labels=labels, logits=logits, name=name)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper
return target(*args, **kwargs)
/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py:4156 sparse_softmax_cross_entropy_with_logits
logits.get_shape()))
ValueError: Shape mismatch: The shape of labels (received (320,)) should equal the shape of logits except for the last dimension (received (32, 10)).

I would suggest using CategoricalCrossentropy.
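Concretely, a sketch of the fix based on the question's code: keep the to_categorical() encoding and switch the loss when compiling.

model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.CategoricalCrossentropy(),  # matches the one-hot y_train / y_test
    metrics=['accuracy']
)

Alternatively, SparseCategoricalCrossentropy could be kept by dropping the to_categorical() calls and passing the raw integer labels to model.fit().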

This is because the input to the first dense layer should be flattened. Each MNIST digit is a 28x28 grid/image. This 28x28 data should be flattened into 784 input values.

So insert a Flatten() Keras layer just before the first Dense(...) layer, i.e. do Flatten()(inputs).

See the Flatten layer documentation for reference.
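As a sketch of what this answer suggests, written with the Keras functional API rather than the subclassed model above:

import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense

inputs = Input(shape=(28, 28))
x = Flatten()(inputs)                  # 28x28 image -> 784 values
x = Dense(300, activation='relu')(x)
x = Dense(100, activation='relu')(x)
outputs = Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)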
