How to load a test set and evaluate a CNN with it in TensorFlow



I am trying to train a CNN on a set of images. There are two folders, training_set and test_set, each containing 2 classes. They look like this:

training_set/
    classA/
        img1.png
        img2.png
        ...
    classB/
        img1.png
        img2.png
        ...
test_set/
    classA/
        img1.png
        img2.png
        ...
    classB/
        img1.png
        img2.png
        ...

The code looks like this, with the training set split into training and validation sets:

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.python.client import device_lib 
import numpy as np
import matplotlib.pyplot as plt
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print(device_lib.list_local_devices())
# Set image properties
img_height = 369
img_width = 496
batch_size = 32
# Import data set from directory
train_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_training_set",
    labels='inferred',
    label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="training"
)
val_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_training_set",
    labels='inferred',
    label_mode="binary",  # not sure about this one though, as the classes are not called '0' and '1'
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=0.2,
    subset="validation"
)

Then:

from matplotlib import pyplot
img_height = 369
img_width = 496
epochs = 25
model = tf.keras.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
# Since we have two classes:
model.add(layers.Dense(1, activation='sigmoid'))
# BinaryCrossentropy because there are 2 classes 
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer, loss=tf.keras.losses.BinaryCrossentropy(from_logits=False), metrics=['accuracy'])
# Feed the model (train_images is already batched by image_dataset_from_directory,
# so no batch_size is passed to fit)
history = model.fit(train_images, epochs=epochs, verbose=1, validation_data=val_images)
# Plot
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

The model is now trained, and it shows the accuracy and loss plots for training and validation. I tried to load my test set using:

test_images = tf.keras.preprocessing.image_dataset_from_directory(
    "path_to_test_set",
    labels='inferred',
    label_mode="binary",
    class_names=['classA', 'classB'],
    color_mode='rgb',
    batch_size=batch_size,  # not really applicable as I want to use the whole set?
    image_size=(img_height, img_width),
    shuffle=True,
    seed=123,
    validation_split=None
)

But is this the right way? How should batch_size be handled? I think I would evaluate the model on my test set using:

test_loss, test_acc = model.evaluate(test_images, verbose=2)
print('\nTest accuracy:', test_acc)

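But I don't think that is enough, since I want accuracy, precision, recall and the F1 score. I am not even sure the right thing is happening here (in how the test set is loaded).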

So basically: how do I load my test set and compute accuracy, precision, recall and the F1 score?

You need to iterate over the data so that you can collect the predictions and the true classes.

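predicted_probs = []
true_classes = []
for images, labels in test_images:
    # model(images) returns a (batch, 1) tensor of sigmoid probabilities
    predicted_probs.append(model(images, training=False).numpy())
    true_classes.append(labels.numpy())
predicted_probs = np.concatenate(predicted_probs)    # shape (N, 1)
true_classes = np.concatenate(true_classes).ravel()  # shape (N,)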
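On the batch_size question: the batch size should only affect memory use and speed, not the collected results, as long as predictions and labels are gathered in the same pass. The sketch below illustrates this with plain NumPy. Here `features`, `labels` and `fake_model` are hypothetical stand-ins for the test images, their labels and the trained model; they are not part of the original code.

```python
import numpy as np

# Hypothetical stand-ins: 10 fake "images" (one feature each) and their labels.
rng = np.random.default_rng(0)
features = rng.random((10, 1))
labels = (rng.random((10, 1)) > 0.5).astype(np.float32)

def fake_model(batch):
    """Hypothetical stand-in for model(images): returns (batch, 1) scores."""
    return 1.0 / (1.0 + np.exp(-(batch - 0.5)))

def collect(batch_size):
    """Gather predictions and labels from a single pass over the data."""
    probs, trues = [], []
    for start in range(0, len(features), batch_size):
        images = features[start:start + batch_size]
        probs.append(fake_model(images))
        trues.append(labels[start:start + batch_size])
    return np.concatenate(probs).ravel(), np.concatenate(trues).ravel()

probs_small, trues_small = collect(batch_size=3)  # 4 batches, the last one short
probs_all, trues_all = collect(batch_size=10)     # one batch holding everything

# Both passes yield the same predictions and labels.
print(np.allclose(probs_small, probs_all), np.array_equal(trues_small, trues_all))
```

One caveat, stated as my understanding rather than something the original covers: with shuffle=True the dataset is likely reshuffled on every pass, so labels taken in one pass and predictions taken in another may not line up. Collecting both inside one loop, or loading the test set with shuffle=False, avoids that.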

Since these are sigmoid outputs, you need to convert them to classes with a threshold, e.g. 0.5:

predicted_classes = [1 * (x[0]>=0.5) for x in predicted_probs]

After that you can get the confusion matrix and so on:

conf_matrix = tf.math.confusion_matrix(true_classes, predicted_classes)
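The four cells of a binary confusion matrix are enough to derive accuracy, precision, recall and the F1 score. Below is a minimal NumPy sketch of that step. The arrays are made-up values standing in for the collected test-set results, and treating class 1 (classB, assuming labels follow the order of class_names) as the positive class is my assumption.

```python
import numpy as np

# Hypothetical values standing in for the arrays collected from the test set.
true_classes = np.array([0, 0, 0, 1, 1, 1, 1, 0])
predicted_probs = np.array([[0.1], [0.6], [0.2], [0.9], [0.4], [0.8], [0.7], [0.3]])

# Threshold the sigmoid outputs at 0.5.
predicted_classes = (predicted_probs[:, 0] >= 0.5).astype(int)

# Confusion-matrix cells, with class 1 assumed to be the positive class.
tp = int(np.sum((true_classes == 1) & (predicted_classes == 1)))
tn = int(np.sum((true_classes == 0) & (predicted_classes == 0)))
fp = int(np.sum((true_classes == 0) & (predicted_classes == 1)))
fn = int(np.sum((true_classes == 1) & (predicted_classes == 0)))

# Guard against division by zero when a class is never predicted or never present.
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

If scikit-learn is available, sklearn.metrics.classification_report(true_classes, predicted_classes) should report the same quantities per class. Another option is to pass tf.keras.metrics.Precision() and tf.keras.metrics.Recall() in the metrics list of model.compile, so that model.evaluate returns them alongside the loss.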
