Keras loss output does not change - activations checked, question still open

I am currently struggling with my CNN. I am using categorical_crossentropy, and I will add my model below. The accuracy neither increases nor decreases, and neither does the loss. The amount of labeled data is only 600 right now, which is quite small, but it seems strange to me that nothing changes at all.

### Define architecture.
from keras.models import Sequential
from keras.layers import (Conv2D, Dropout, BatchNormalization,
                          GlobalMaxPooling2D, Dense)
from keras import optimizers

model = Sequential()
model.add(Conv2D(32, 4, strides=(11,11), padding="same",
                 input_shape=(200,200,3), activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Conv2D(64, 4, strides=(9,9), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Conv2D(128, 4, strides=(5,5), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dense(y_test.shape[1], activation="sigmoid"))
model.summary()
sgd = optimizers.SGD(lr=0.1)  # 0.1
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,  # pass the SGD(lr=0.1) object, not the 'sgd' string
              metrics=['accuracy'])

model1 = model.fit(x_train, y_train,batch_size=32, epochs=10, verbose=1)
Epoch 1/10
420/420 [==============================] - 5s 11ms/step - loss: 1.4598 - acc: 0.2381
Epoch 2/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4679 - acc: 0.2333
Epoch 3/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4335 - acc: 0.2667
Epoch 4/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4198 - acc: 0.2310
Epoch 5/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4038 - acc: 0.2524
Epoch 6/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4343 - acc: 0.2643
Epoch 7/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4281 - acc: 0.2786
Epoch 8/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4097 - acc: 0.2333
Epoch 9/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4071 - acc: 0.2714
Epoch 10/10
420/420 [==============================] - 1s 3ms/step - loss: 1.4135 - acc: 0.2476

Is there something wrong with my model? I have tried changing the lr, the size of the images, simplifying the model, changing the kernel size and letting it run for more epochs (up to 60), and I printed the predictions for x_test. Those predictions also seem wrong:

error = model.predict(x_test)
print(error)
[[0.49998534 0.49998534 0.4999715  0.50000155]
 [0.49998188 0.49998283 0.49997032 0.5000029 ]
 [0.49998188 0.4999858  0.49998164 0.5000036 ]
 [0.4999795  0.49998736 0.4999841  0.5000008 ]
 [0.49998784 0.49997187 0.49996948 0.5000013 ]
 [0.49997532 0.49997967 0.49997616 0.50000024]

Any kind of help is greatly appreciated! Thank you!

There are a few things I can recommend you try, based on my experience:

  • Since you are using categorical cross-entropy, try "softmax" as the activation function of the last layer instead of "sigmoid".
  • You should lower your learning rate (a new setting is suggested in the sketch below).
  • You could try a different optimizer, e.g. "adam" instead of "sgd".
  • You could remove the Dropout and BatchNormalization layers and add them back only where necessary.
  • Specify the kernel size as a 2-D tuple rather than a single number, e.g. change it from 4 to (3,3). Also reduce the strides; you could start from (1,1). Using a kernel of size 4 with strides of (11,11) on a [200x200] image is almost equal to "learning nothing".

Please try the last suggestion first, since that seems to be the main problem; a sketch combining these changes follows. I hope one of them helps you.
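
A rough sketch of the model under these suggestions, not a tuned implementation: the layer widths are the asker's originals, the 4 output classes come from the summary below, and the Adam learning rate is illustrative.

from keras.models import Sequential
from keras.layers import Conv2D, GlobalMaxPooling2D, Dense
from keras import optimizers

model = Sequential()
model.add(Conv2D(32, (3, 3), strides=(1, 1), padding="same",
                 input_shape=(200, 200, 3), activation="relu"))
model.add(Conv2D(64, (3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(Conv2D(128, (3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dense(4, activation="softmax"))  # softmax to match categorical_crossentropy

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(lr=0.001),  # much lower lr than 0.1
              metrics=['accuracy'])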

Try the following settings:

  • Reduce the strides to 1x1 or 2x2, at most 3x3.
  • Remove the dropout between the convolutional layers; if necessary, use dropout only before the dense layers.
  • Try adding pooling layers, preferably max pooling with a 2x2 stride and a 2x2 kernel size after the convolutional layers.
  • Change the optimizer to adam/nadam.
  • Use softmax instead of sigmoid.
  • Increase the number of epochs; 10 is too low.

All of the points above can vary depending on the problem; try them out and modify the model accordingly. A sketch incorporating these settings is shown below.
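
For illustration only, one possible version of the model under these settings (pooling after each convolution, dropout only before the dense layers, adam, softmax, more epochs), reusing the asker's x_train/y_train and the 4 classes from the summary:

from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, GlobalMaxPooling2D,
                          Dense, Dropout)

model = Sequential()
model.add(Conv2D(32, (3, 3), strides=(1, 1), padding="same",
                 input_shape=(200, 200, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(64, (3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, (3, 3), strides=(1, 1), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(GlobalMaxPooling2D())
model.add(Dropout(0.2))  # dropout only before the dense layers
model.add(Dense(128, activation="relu"))
model.add(Dense(4, activation="softmax"))

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=60, verbose=1)  # well above 10 epochs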

Because of the strides you are using, you seem to be losing almost all of the spatial information in the image in the first two layers.

Your model.summary() shows the problem:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 19, 19, 32)        1568      
_________________________________________________________________
dropout_1 (Dropout)          (None, 19, 19, 32)        0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 19, 19, 32)        128       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          32832     
_________________________________________________________________
dropout_2 (Dropout)          (None, 3, 3, 64)          0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 3, 3, 64)          256       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 1, 1, 128)         131200    
_________________________________________________________________
dropout_3 (Dropout)          (None, 1, 1, 128)         0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 1, 1, 128)         512       
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 183,524
Trainable params: 183,076
Non-trainable params: 448

What you see is that the tensor size drops immediately from 200 in the original image to 19 after the first convolution, and to 3 after the second. We would expect the size to decrease gradually, so that the convolutional layers can actually be put to use.
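
These sizes follow directly from the output-size rule for padding="same", output = ceil(input / stride); a quick check in Python:

from math import ceil

size = 200
for stride in (11, 9, 5):       # the three conv strides in the question
    size = ceil(size / stride)  # padding="same": output = ceil(input / stride)
    print(size)                 # prints 19, 3, 1 -- matching the summary above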

If you keep your code otherwise unchanged and set all strides to (2, 2), you get a much more reasonable structure (the corresponding change is shown after the summary):

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 100, 100, 32)      1568      
_________________________________________________________________
dropout_1 (Dropout)          (None, 100, 100, 32)      0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 100, 100, 32)      128       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 50, 50, 64)        32832     
_________________________________________________________________
dropout_2 (Dropout)          (None, 50, 50, 64)        0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 50, 50, 64)        256       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 25, 25, 128)       131200    
_________________________________________________________________
dropout_3 (Dropout)          (None, 25, 25, 128)       0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 25, 25, 128)       512       
_________________________________________________________________
global_max_pooling2d_1 (Glob (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 183,524
Trainable params: 183,076
Non-trainable params: 448
_________________________________________________________________
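
For reference, this summary comes from the asker's model with only the strides arguments changed; kernel size and everything else are as in the question, which is why the parameter counts are identical (convolution parameters do not depend on strides).

model = Sequential()
model.add(Conv2D(32, 4, strides=(2,2), padding="same",
                 input_shape=(200,200,3), activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Conv2D(64, 4, strides=(2,2), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Conv2D(128, 4, strides=(2,2), padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dense(y_test.shape[1], activation="sigmoid"))  # the other answers suggest softmax here
model.summary()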
