val_accuracy higher than test accuracy in a model without dropout regularization



I recently created a machine learning model with 810 training images and 810 test images (27 classes) to recognize American Sign Language gestures. I trained this model with the SGD optimizer, a learning rate of 0.001, 5 epochs, and categorical cross-entropy loss. However, my validation accuracy is about 20% higher than my test accuracy, and I'm not sure why. I've tried adjusting my model structure, optimizer, learning rate, and number of epochs, but it never changed. Does anyone have any ideas? Below is my model code:

import numpy as np
import tensorflow as tf
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(27, activation='softmax')
])
model.summary()

sgd = SGD(learning_rate=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if logs and logs.get('accuracy', 0) > 0.95:
            print("\nReached >95% accuracy so cancelling training!")
            self.model.stop_training = True

callbacks = myCallback()

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,   # shift image width by up to 20%
    height_shift_range=0.2,  # shift image height by up to 20%
    shear_range=0.2,         # shear across the X-axis by up to 20%
    zoom_range=0.2,          # zoom by up to 20%
    horizontal_flip=True,
    fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(
    "/content/drive/MyDrive/train_asl",
    target_size=(150, 150),
    class_mode='categorical',
    batch_size=5)

validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    "/content/drive/MyDrive/test_asl",
    target_size=(150, 150),
    class_mode='categorical',
    batch_size=5)

# model.fit accepts generators directly; fit_generator is deprecated.
history = model.fit(
    train_generator,
    steps_per_epoch=np.ceil(810 / 5),    # 810 images / batch size 5 = 162 steps
    epochs=100,
    validation_data=validation_generator,
    validation_steps=np.ceil(810 / 5),   # 810 images / batch size 5 = 162 steps
    callbacks=[callbacks],
    verbose=2)

Validation and test accuracy can differ if the underlying data distributions of the validation and test data are different or imbalanced.
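One quick way to check for such an imbalance is to compare the normalized class frequencies of the two sets. A minimal sketch (the `class_distribution` helper is hypothetical; in your setup the label lists could come from `train_generator.classes` and `validation_generator.classes`):

```python
from collections import Counter

def class_distribution(labels):
    """Return {class: fraction} for a list of integer class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in sorted(counts)}

# Synthetic example: a balanced set vs. an imbalanced one.
val_labels  = [0] * 30 + [1] * 30 + [2] * 30
test_labels = [0] * 60 + [1] * 20 + [2] * 10

print(class_distribution(val_labels))   # every class near 1/3
print(class_distribution(test_labels))  # class 0 dominates
```

If the two printed distributions diverge noticeably, accuracy measured on one set will not transfer to the other.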

You can use stratified sampling to ensure that the distribution over the 27 classes is roughly the same in the validation and test sets. You can also check whether the input data distributions seen during validation and testing are the same or different, since 810 images is not many, especially if you are not using transfer learning.
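A minimal stratified-split sketch using only the standard library (scikit-learn's `train_test_split(..., stratify=labels)` does the same job in one call; the file names below are made up for illustration):

```python
import random
from collections import defaultdict

def stratified_split(items, labels, val_fraction=0.5, seed=0):
    """Split (items, labels) so each class keeps the same proportion in both halves."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)
    val, test = [], []
    for label, group in by_class.items():
        rng.shuffle(group)
        cut = int(len(group) * val_fraction)
        val += [(x, label) for x in group[:cut]]
        test += [(x, label) for x in group[cut:]]
    return val, test

# 27 classes, 30 images each, mirroring the question's 810-image test set.
paths = [f"img_{c}_{i}.jpg" for c in range(27) for i in range(30)]
labels = [c for c in range(27) for _ in range(30)]
val, test = stratified_split(paths, labels)
# Each class contributes 15 images to each half.
```

With a split like this, a 20-point gap between validation and test accuracy is much less likely to come from class imbalance alone.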
