How to use NumPy arrays as the training set with ImageDataGenerator



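I'm building an ML model that takes pixel values from NumPy arrays as its training and test data. I wrote a function that splits the dataset into images and labels. My task is to use ImageDataGenerator for data augmentation and then train the model. Everything went fine until I started training, when I kept getting errors about the loss function. With categorical_crossentropy, the error message said I could either use 'sparse_categorical_crossentropy' or convert the labels with to_categorical. I tried both and still got errors, so I tried calling tf.convert_to_tensor() on my labels, but now I get a shape error: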

ValueError: A target array with shape (126, 25, 2) was passed for an output of shape (None, 3) while using as loss `categorical_crossentropy`. This loss expects targets to have the same shape as the output.

Here is my code:

training_labels = tf.convert_to_tensor(training_labels)
testing_labels = tf.convert_to_tensor(testing_labels)

# Create an ImageDataGenerator and do Image Augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
validation_datagen = ImageDataGenerator(rescale = 1./255)

train_generator = train_datagen.flow(
    training_images,
    training_labels,
    batch_size=126
)
validation_generator = validation_datagen.flow(
    testing_images,
    testing_labels,
    batch_size=126
)

# Keep These
print(training_images.shape)
print(testing_images.shape)

# Their output should be:
# (27455, 28, 28, 1)
# (7172, 28, 28, 1)

The model looks like this:

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])
# Compile Model. 
model.compile(loss = 'categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
# Train the Model
history = model.fit_generator(train_generator, validation_data=validation_generator, epochs=2)

model.evaluate(testing_images, testing_labels, verbose=0)

I'm stuck on this. I searched for a solution but had no luck. Can you help me figure it out?

Thanks a lot!

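When you use categorical cross-entropy as the loss function, the labels must be one-hot encoded, and the number of neurons in the last layer must equal the number of classes in the dataset. That mismatch is the error you are getting. Since your output layer has 3 neurons, I assume you have 3 classes, so training_labels and testing_labels should have shape (number of train/test images, 3).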
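To make the shape requirement concrete, here is a minimal NumPy sketch of one-hot encoding. The helper `one_hot` and the random labels are hypothetical stand-ins, not the asker's data. As far as I know, Keras's to_categorical behaves the same way for integer labels and infers the class count as the largest label plus one when num_classes is omitted. The second half shows one way a target shape like (126, 25, 2) could arise: encoding labels that are already one-hot. This is only a guess about what happened in the question; if it is right, the labels cover 25 classes and the last Dense layer would need 25 units rather than 3.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Append a one-hot axis of length num_classes to an integer label array."""
    return np.eye(num_classes, dtype="float32")[labels]

# Hypothetical batch: 126 integer labels drawn from 25 classes.
rng = np.random.default_rng(0)
int_labels = rng.integers(0, 25, size=126)

# Encoding once gives (batch, num_classes), which matches a Dense(25) output.
encoded_once = one_hot(int_labels, 25)
print(encoded_once.shape)   # (126, 25)

# Encoding the already-encoded 0/1 values again adds another axis.
encoded_twice = one_hot(encoded_once.astype(int), 2)
print(encoded_twice.shape)  # (126, 25, 2)

# argmax along the last axis recovers the original integer labels.
recovered = encoded_once.argmax(axis=-1)
```

Printing the label array's shape right before calling flow() is a cheap way to check which of these cases applies.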

Here is a short snippet using the MNIST dataset:

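import numpy as np
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# ImageDataGenerator expects rank-4 input, so add a channel axis: (60000, 28, 28, 1)
x_train = x_train[..., np.newaxis].astype("float32")
x_test = x_test[..., np.newaxis].astype("float32")
num_classes = 10
# convert to one-hot encoding
# shape will be (60000, 10) since there are 60000 training images and 10 classes in MNIST
y_train = to_categorical(y_train, num_classes)
# shape will be (10000, 10) since there are 10000 test images and 10 classes in MNIST
y_test = to_categorical(y_test, num_classes)
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)
# `model` is assumed to be defined and compiled with loss='categorical_crossentropy'
# and a final Dense(num_classes, activation='softmax') layer.
# model.fit accepts generators directly; fit_generator is deprecated in TF 2.x.
history = model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=2)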
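The question also mentions 'sparse_categorical_crossentropy'. That loss takes integer class labels of shape (batch,) instead of one-hot rows, so it is an alternative to calling to_categorical rather than something to combine with it. The NumPy sketch below illustrates the arithmetic on made-up probabilities and shows that both forms give the same per-sample loss. The two functions are simplified illustrations of the math, not Keras's implementation, which as far as I know also clips probabilities for numerical stability.

```python
import numpy as np

def categorical_crossentropy(one_hot_targets, probs):
    """Per-sample loss for one-hot targets of shape (batch, classes)."""
    return -np.sum(one_hot_targets * np.log(probs), axis=-1)

def sparse_categorical_crossentropy(int_targets, probs):
    """Per-sample loss for integer targets of shape (batch,)."""
    rows = np.arange(len(int_targets))
    return -np.log(probs[rows, int_targets])

# Made-up softmax outputs for 4 samples and 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5],
                  [0.6, 0.3, 0.1]])
int_targets = np.array([0, 1, 2, 1])      # integer labels, shape (4,)
one_hot_targets = np.eye(3)[int_targets]  # one-hot labels, shape (4, 3)

dense_loss = categorical_crossentropy(one_hot_targets, probs)
sparse_loss = sparse_categorical_crossentropy(int_targets, probs)
print(np.allclose(dense_loss, sparse_loss))  # True
```

Either way, the number of columns in the model's softmax output has to equal the number of classes in the labels.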