The initial loss of the second epoch does not match the final loss of the first epoch; loss and accuracy then stay constant in every epoch



The loss at the start of the second epoch does not match the loss at the end of the first. After that, the loss at the start of every epoch stays the same, epoch after epoch, and accuracy stays constant too. I have some deep-learning background, but this is my first time implementing my own model, so I would like to understand intuitively what is going wrong. The dataset consists of cropped faces in two classes, with 300 images per class. Any help is much appreciated.

import tensorflow as tf
from tensorflow import keras
from IPython.display import Image
import matplotlib.pyplot as plt
from keras.layers import ActivityRegularization
from keras.layers import Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
image_generator = ImageDataGenerator(
    featurewise_center=False, samplewise_center=False,
    featurewise_std_normalization=False, samplewise_std_normalization=False,
    rotation_range=0, width_shift_range=0.0, height_shift_range=0.0,
    brightness_range=None, shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0,
    horizontal_flip=False, vertical_flip=False, rescale=1./255
)
image = image_generator.flow_from_directory('./util/untitled folder',batch_size=938)
x, y = image.next()
x_train = x[:500]
y_train = y[:500]
x_test = x[500:600]
y_test = y[500:600]
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(4)
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(4)
plt.imshow(x_train[0])
def convolutional_model(input_shape):
    input_img = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(64, (7, 7), padding='same')(input_img)
    x = tf.keras.layers.BatchNormalization(axis=3)(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2, 2), strides=1, padding='same')(x)
    x = Dropout(0.5)(x)
    x = tf.keras.layers.Conv2D(128, (3, 3), padding='same', strides=1)(x)
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.MaxPool2D(pool_size=(2, 2), padding='same', strides=4)(x)
    x = tf.keras.layers.Flatten()(x)
    x = ActivityRegularization(0.1, 0.2)(x)
    outputs = tf.keras.layers.Dense(2, activation='softmax')(x)

    model = tf.keras.Model(inputs=input_img, outputs=outputs)
    return model

conv_model = convolutional_model((256, 256, 3))
conv_model.compile(loss=keras.losses.categorical_crossentropy,
                   optimizer=keras.optimizers.SGD(lr=1),
                   metrics=['accuracy'])
conv_model.summary()
conv_model.fit(train_dataset,epochs=100, validation_data=test_dataset)

Epoch 1/100
2021-12-23 15:06:22.165763: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2021-12-23 15:06:22.172255: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
125/125 [==============================] - ETA: 0s - loss: 804.6805 - accuracy: 0.48602021-12-23 15:06:50.936870: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
125/125 [==============================] - 35s 275ms/step - loss: 804.6805 - accuracy: 0.4860 - val_loss: 0.7197 - val_accuracy: 0.4980
Epoch 2/100
125/125 [==============================] - 34s 270ms/step - loss: 0.7360 - accuracy: 0.4820 - val_loss: 0.7197 - val_accuracy: 0.4980
Epoch 3/100
125/125 [==============================] - 34s 276ms/step - loss: 0.7360 - accuracy: 0.4820 - val_loss: 0.7197 - val_accuracy: 0.4980


Since you have a constant loss and accuracy, your network is most likely not learning anything (since you have two classes, it always predicts one of them).

The activation function, the loss function, and the number of neurons in the last layer are correct.

The problem is not related to how you load the images, but rather to the learning rate of 1. At such a high learning rate, it is impossible for the network to learn anything.

You should start with a much smaller learning rate, for example 0.0001 or 0.00001, and then try to debug the data-loading process only if performance is still poor.
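To build intuition for why a learning rate of 1 can stall training completely, here is a toy gradient-descent sketch (plain Python, illustrative only, not part of the original model): minimizing f(w) = w², a step of lr=1 maps w to -w, so the weight flips sign forever and never approaches the minimum, while a smaller rate shrinks it toward zero:

```python
def gradient_step(w, lr):
    # f(w) = w**2, so the gradient is 2*w
    return w - lr * 2 * w

w_big, w_small = 5.0, 5.0
for _ in range(50):
    w_big = gradient_step(w_big, lr=1.0)      # w -> -w: oscillates, |w| never shrinks
    w_small = gradient_step(w_small, lr=0.1)  # w -> 0.8*w: decays toward the minimum

print(abs(w_big))    # still 5.0 -- no progress after 50 steps
print(abs(w_small))  # roughly 7e-5 -- converged toward 0
```

A real network's loss surface is far less forgiving than this quadratic, so an oversized step typically overshoots into a region where the model collapses to predicting a single class, which matches the constant loss and ~0.5 accuracy in the log above.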

I am pretty sure this is related to how the data is loaded, more specifically the x, y = image.next() part. If you can split the data in ./util/untitled folder into separate folders holding the training and validation data respectively, you can use the same kind of pipeline as in the example section on the Tensorflow page:

train_datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    rotation_range=0,
    width_shift_range=0.0,
    height_shift_range=0.0,
    brightness_range=None,
    shear_range=0.0,
    zoom_range=0.0,
    channel_shift_range=0.0,
    horizontal_flip=False,
    vertical_flip=False,
    rescale=1./255)
test_datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    rotation_range=0,
    width_shift_range=0.0,
    height_shift_range=0.0,
    brightness_range=None,
    shear_range=0.0,
    zoom_range=0.0,
    channel_shift_range=0.0,
    horizontal_flip=False,
    vertical_flip=False,
    rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(256, 256),
    batch_size=4)
validation_generator = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(256, 256),
    batch_size=4)
model.fit(
    train_generator,
    epochs=100,
    validation_data=validation_generator)