I have a sequential network that takes sentences as vectors of 20 words, with the goal of classifying each sentence by its label. Each word has 300 dimensions, so every sentence has shape (20, 300). The dataset currently has 11 samples, so the full x_train has shape (11, 20, 300).
Here is my network code:
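For concreteness, a shape-compatible dummy dataset could be built like this (the contents are random; only the shapes matter, and the variable names are illustrative):

```python
import numpy as np

num_samples, seq_len, embed_dim = 11, 20, 300

# 11 sentences, each 20 words, each word a 300-dim embedding
x_train = np.random.rand(num_samples, seq_len, embed_dim).astype("float32")

# one integer label per sentence
y_train = np.random.randint(0, 2, size=(num_samples,))

print(x_train.shape)  # (11, 20, 300)
print(y_train.shape)  # (11,)
```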
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape=(20, 300)))
nnmodel.add(keras.layers.Dense(units=300, activation="relu"))
nnmodel.add(keras.layers.Dense(units=20, activation="relu"))
nnmodel.add(keras.layers.Dense(units=1, activation="sigmoid"))
nnmodel.compile(optimizer='adam',
                loss='SparseCategoricalCrossentropy',
                metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size=1)
for layer in nnmodel.layers:
    print(layer.output_shape)
This gives:
Epoch 1/10
11/11 [==============================] - 0s 1ms/step - loss: 2.9727 - accuracy: 0.0455
Epoch 2/10
11/11 [==============================] - 0s 1ms/step - loss: 2.7716 - accuracy: 0.0682
Epoch 3/10
11/11 [==============================] - 0s 1ms/step - loss: 2.6279 - accuracy: 0.0682
Epoch 4/10
11/11 [==============================] - 0s 1ms/step - loss: 2.4878 - accuracy: 0.0682
Epoch 5/10
11/11 [==============================] - 0s 1ms/step - loss: 2.3145 - accuracy: 0.0545
Epoch 6/10
11/11 [==============================] - 0s 1ms/step - loss: 2.0505 - accuracy: 0.0545
Epoch 7/10
11/11 [==============================] - 0s 1ms/step - loss: 1.7010 - accuracy: 0.0545
Epoch 8/10
11/11 [==============================] - 0s 992us/step - loss: 1.2874 - accuracy: 0.0545
Epoch 9/10
11/11 [==============================] - 0s 891us/step - loss: 0.9628 - accuracy: 0.0545
Epoch 10/10
11/11 [==============================] - 0s 794us/step - loss: 0.7960 - accuracy: 0.0545
(None, 20, 300)
(None, 20, 20)
(None, 20, 1)
Why does my output layer return (20, 1)? Its shape should be (1), since each of my labels is just a single integer. I am confused, and I also don't understand how the loss is being computed at all if the shape is wrong.
Any help would be greatly appreciated. Thanks!
For the current code, this is the expected output. Applying a Dense layer to a multi-dimensional input only changes the size of the last dimension. You may have noticed that in CNNs, for the same reason, we usually add a Flatten layer after the convolutional layers. The Flatten layer essentially reshapes the input array to remove the extra dimensions, so each sample becomes one-dimensional. The updated code should be:
nnmodel = keras.Sequential()
nnmodel.add(keras.layers.InputLayer(input_shape=(20, 300)))
nnmodel.add(keras.layers.Flatten())  # this is the code change
nnmodel.add(keras.layers.Dense(units=300, activation="relu"))
nnmodel.add(keras.layers.Dense(units=20, activation="relu"))
nnmodel.add(keras.layers.Dense(units=1, activation="sigmoid"))
nnmodel.compile(optimizer='adam',
                # a single sigmoid unit with integer 0/1 labels calls for
                # binary cross-entropy, not sparse categorical cross-entropy
                loss='binary_crossentropy',
                metrics=['accuracy'])
nnmodel.fit(x_train, y_train, epochs=10, batch_size=1)
for layer in nnmodel.layers:
    print(layer.output_shape)
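The shape behavior described above can be checked with plain NumPy, without Keras at all: a Dense layer with `units` outputs is a matrix multiply along the last axis, so on a (20, 300) sample it yields one value per word, while flattening first yields one value per sentence. A minimal sketch (the weights are random placeholders, purely to show shapes):

```python
import numpy as np

sample = np.random.rand(20, 300)    # one sentence: 20 words x 300 dims

# Dense(units=1) applied directly: contracts only the last axis
w = np.random.rand(300, 1)
out_per_word = sample @ w           # shape (20, 1): one value per word

# Flatten first, then Dense(units=1): one value for the whole sentence
flat = sample.reshape(-1)           # shape (6000,)
w_flat = np.random.rand(6000, 1)
out_per_sentence = flat @ w_flat    # shape (1,)

print(out_per_word.shape)           # (20, 1)
print(out_per_sentence.shape)       # (1,)
```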