TensorFlow accuracy from model.predict does not match the final-epoch val_accuracy from model.fit

I am trying to match the accuracy of a model.predict call with the final val_accuracy from model.fit(). I am using a tf dataset.

val_ds = tf.keras.utils.image_dataset_from_directory(
    'my_path',
    validation_split=0.2,
    subset="validation",
    seed=38,
    image_size=(SIZE, SIZE),
)

train_ds is set up similarly. I prefetch both...

train_ds = train_ds.prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.prefetch(buffer_size=AUTOTUNE)

Then I get the labels of val_ds so I can use them later

true_categories = tf.concat([y for x, y in val_ds], axis=0)

My model

inputs = tf.keras.Input(shape=(SIZE, SIZE, 3))
# ... some other layers
outputs = tf.keras.layers.Dense(len(CLASS_NAMES), activation=tf.keras.activations.softmax)(intermediate)
model = tf.keras.Model(inputs, outputs)

Compiles fine

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy'])

Fits seemingly fine

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10,
    class_weight=class_weights)  # I do weight the classes due to imbalance

Last epoch output

Epoch 10: val_accuracy did not improve from 0.92291
176/176 [==============================] - 191s/step - loss: 0.9876 - accuracy: 0.7318 - val_loss: 0.4650 - val_accuracy: 0.8580

Now I want to verify that val_accuracy == 0.8580 when running model.predict()

predictions = model.predict(val_ds, verbose=2 ) 
flattened_predictions =  predictions.argmax(axis=1)
accuracy = metrics.accuracy_score(true_categories, flattened_predictions)
print ("Accuracy = ", accuracy)

Accuracy =  0.7980014275517487

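I would have expected this to equal the last val_accuracy, i.e. 0.8580, but it is off. My val_ds uses a seed, so I should get the images in the same order when shuffling, right? Getting the ground-truth labels out of a dataset is a pain, but I think (???) my approach is correct.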

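I only have two classes, and when I look at my predictions variable it looks like I am getting the probabilities I expect, so I think I have set up, compiled and fit my model correctly for sparse categorical cross-entropy with softmax on the final layer's output.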

predictions[:3] #show the first 3 predictions, the values sum to 1.0 as expected

array([[0.42447385, 0.5755262 ], [0.2162129 , 0.7837871 ], [0.31917858, 0.6808214 ]], dtype=float32)

What am I missing?

What you are missing is that the validation dataset is reshuffled every time it is iterated.

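By default, tf.keras.utils.image_dataset_from_directory has shuffle=True. The shuffle method of a TensorFlow dataset has a parameter reshuffle_each_iteration, which defaults to None, and None is treated as True. The dataset is therefore shuffled into a new order on every pass, so the labels you collected in one pass do not line up with the predictions that model.predict produces in another pass.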

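The seed=38 argument determines which samples are reserved for training and which for validation. In other words, with the seed argument we know which samples end up in the validation dataset and which in the training dataset. It does not fix the order in which those samples are iterated.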

For example:

import tensorflow as tf

# reshuffle_each_iteration=None behaves like True: a new order on every pass
dataset = tf.data.Dataset.range(6)
dataset = dataset.shuffle(6, reshuffle_each_iteration=None, seed=154).batch(2)
print("First time iteration:")
for x in dataset:
    print(x)
print("\n")
print("Second time iteration")
for x in dataset:
    print(x)

This prints:

First time iteration:
tf.Tensor([2 1], shape=(2,), dtype=int64)
tf.Tensor([3 0], shape=(2,), dtype=int64)
tf.Tensor([5 4], shape=(2,), dtype=int64)

Second time iteration
tf.Tensor([4 3], shape=(2,), dtype=int64)
tf.Tensor([0 5], shape=(2,), dtype=int64)
tf.Tensor([2 1], shape=(2,), dtype=int64)

The relevant source code for tf.keras.utils.image_dataset_from_directory can be found here.
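To see why the order matters, here is a small NumPy-only sketch. It does not use the asker's data: the arrays, the 0.9 accuracy figure and the seed are all made up for illustration. It scores the same predictions twice, once against labels in the matching order and once against labels gathered in a different shuffled order, which is roughly what happens when val_ds is iterated once for the labels and again inside model.predict.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, illustration only

# Hypothetical ground truth for 1000 samples of a two-class problem.
labels = rng.integers(0, 2, size=1000)

# Hypothetical predictions: copy the labels, then flip 100 of them,
# so the true accuracy is exactly 0.9.
predictions = labels.copy()
wrong = rng.choice(labels.size, size=100, replace=False)
predictions[wrong] = 1 - predictions[wrong]

# Labels gathered in a separate pass arrive in a different order.
labels_other_pass = labels[rng.permutation(labels.size)]

aligned_accuracy = np.mean(predictions == labels)
misaligned_accuracy = np.mean(predictions == labels_other_pass)

print("aligned:   ", aligned_accuracy)
print("misaligned:", misaligned_accuracy)
```

With roughly balanced made-up classes the misaligned score lands near chance. With imbalanced classes, as in the question, a misaligned score can stay well above chance, which makes the mismatch easy to overlook.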

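If you want to match the predictions with their respective labels, you can loop over the dataset so that each batch of images and its labels come from the same pass: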

import numpy as np

predictions = []
labels = []
for x, y in val_ds:
    # predict and read the labels from the same batch, so the order matches
    predictions.append(np.argmax(model(x), axis=-1))
    labels.append(y.numpy())
predictions = np.concatenate(predictions, axis=0)
labels = np.concatenate(labels, axis=0)

Then you can check the accuracy.
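One way to do that check, sketched with NumPy only. The two arrays below are placeholders standing in for the labels and predictions built by the loop above; their values are made up.

```python
import numpy as np

# Placeholder arrays standing in for the outputs of the loop above.
labels = np.array([0, 1, 1, 0, 1, 1])
predictions = np.array([0, 1, 0, 0, 1, 1])

# Fraction of samples whose predicted class equals the true class.
accuracy = np.mean(predictions == labels)
print("Accuracy =", accuracy)
```

If scikit-learn is available, sklearn.metrics.accuracy_score(labels, predictions), as used in the question, should report the same number.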
