Custom Model train_step is never called



I want to build and train a language model based on this example from the Keras documentation.

My MaskedLanguageModel class looks like this:


import tensorflow as tf
from tensorflow import keras

loss_fn = keras.losses.SparseCategoricalCrossentropy(
    reduction=tf.keras.losses.Reduction.NONE
)
loss_tracker = tf.keras.metrics.Mean(name="loss")

class MaskedLanguageModel(tf.keras.Model):
    def train_step(self, inputs):
        if len(inputs) == 3:
            features, labels, sample_weight = inputs
        else:
            features, labels = inputs
            sample_weight = None

        with tf.GradientTape() as tape:
            predictions = self(features, training=True)
            loss = loss_fn(labels, predictions, sample_weight=sample_weight)

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)

        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))

        # Compute our own metrics
        loss_tracker.update_state(loss, sample_weight=sample_weight)

        # Return a dict mapping metric names to current value
        return {"loss": loss_tracker.result()}

    @property
    def metrics(self):
        # We list our `Metric` objects here so that `reset_states()` can be
        # called automatically at the start of each epoch
        # or at the start of `evaluate()`.
        # If you don't implement this property, you have to call
        # `reset_states()` yourself at the time of your choosing.
        return [loss_tracker]
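
For readers unfamiliar with how the loss and the tracker interact in the class above, here is a minimal plain-Python sketch of the arithmetic. It is not the Keras implementation. It assumes `sample_weight` is a 0/1 mask over token positions, and the helper name `weighted_mean_loss` and the numbers are hypothetical. With `Reduction.NONE` the loss function returns one value per position scaled by its weight, and a mean metric updated with the same weights reports the weighted sum divided by the sum of the weights.

```python
def weighted_mean_loss(per_token_loss, sample_weight):
    """Mimic loss_fn(..., sample_weight=w) followed by a weighted mean metric."""
    # Step 1: with no reduction, each per-token loss is scaled by its weight.
    weighted = [l * w for l, w in zip(per_token_loss, sample_weight)]
    # Step 2: the weighted mean divides the weighted sum by the sum of weights.
    total = sum(v * w for v, w in zip(weighted, sample_weight))
    count = sum(sample_weight)
    # Report 0.0 when no position carries any weight.
    return total / count if count else 0.0


# Hypothetical values: only positions 0 and 2 are scored.
losses = [2.0, 5.0, 4.0, 7.0]
mask = [1.0, 0.0, 1.0, 0.0]
print(weighted_mean_loss(losses, mask))  # 3.0
```

Positions with weight 0 contribute nothing to either the numerator or the denominator, so only the scored positions affect the reported loss.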

The problem I am facing is that the train_step function is never called, and my loss is always "0.0000e+00".

However, when I comment out the following part, the loss is updated and decreases.


    @property
    def metrics(self):
        # We list our `Metric` objects here so that `reset_states()` can be
        # called automatically at the start of each epoch
        # or at the start of `evaluate()`.
        # If you don't implement this property, you have to call
        # `reset_states()` yourself at the time of your choosing.
        return [loss_tracker]
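
One way to read this symptom, offered as an assumption rather than a confirmed diagnosis: if `fit()` reports whatever `metrics` returns while `train_step` never runs, the tracker is never updated and its result stays at zero. The plain-Python class below is a hypothetical stand-in for a mean metric (it is not `tf.keras.metrics.Mean`) that shows this behaviour.

```python
class RunningMean:
    """Hypothetical stand-in for a mean-style metric tracker."""

    def __init__(self):
        self.total = 0.0
        self.count = 0.0

    def update_state(self, value, weight=1.0):
        # Accumulate the weighted value and the weight itself.
        self.total += value * weight
        self.count += weight

    def result(self):
        # With no updates there is nothing to average, so report 0.0.
        return self.total / self.count if self.count else 0.0


tracker = RunningMean()
print(f"{tracker.result():.4e}")  # 0.0000e+00 (never updated)
tracker.update_state(2.5)
print(tracker.result())  # 2.5
```

A quick way to test this reading in the real model would be to confirm whether `loss_tracker.update_state` is ever reached during `fit()`.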

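According to this answer, train_step should be called automatically when the model is trained via .fit(…).

What am I doing wrong?

(My TensorFlow version is 2.1.0.)

Thank you in advance.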

I implemented the model in the latest version of TensorFlow (2.10.0) using the train step shown below:

import tensorflow as tf
from tensorflow import keras

loss_fn = keras.losses.SparseCategoricalCrossentropy(
    reduction=tf.keras.losses.Reduction.NONE
)
loss_tracker = tf.keras.metrics.Mean(name="loss")

class MaskedLanguageModel(tf.keras.Model):
    def train_step(self, inputs):
        if len(inputs) == 3:
            features, labels, sample_weight = inputs
        else:
            features, labels = inputs
            sample_weight = None

        with tf.GradientTape() as tape:
            predictions = self(features, training=True)
            loss = loss_fn(labels, predictions, sample_weight=sample_weight)

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)

        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))

        # Compute our own metrics
        loss_tracker.update_state(loss, sample_weight=sample_weight)

        # Return a dict mapping metric names to current value
        return {"loss": loss_tracker.result()}

    @property
    def metrics(self):
        # We list our `Metric` objects here so that `reset_states()` can be
        # called automatically at the start of each epoch
        # or at the start of `evaluate()`.
        # If you don't implement this property, you have to call
        # `reset_states()` yourself at the time of your choosing.
        return [loss_tracker]

It works fine. Please refer to this working gist. Thank you.
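
A likely reason the same code behaves differently across the two versions, stated here as background rather than as part of the answer: support for overriding `train_step` in `Model.fit` arrived in TensorFlow 2.2, so version 2.1.0 would ignore the override. The helper below is a hypothetical sketch (the name `supports_train_step_override` is not a TensorFlow API) that compares version strings numerically, since a plain string comparison would rank "2.10.0" below "2.2.0".

```python
def supports_train_step_override(version: str) -> bool:
    """Return True if the version string is at least 2.2 (major.minor)."""
    # Keep only the major and minor parts; later parts may carry suffixes.
    major, minor = (int(part) for part in version.split(".")[:2])
    # Tuple comparison is numeric, so (2, 10) correctly ranks above (2, 2).
    return (major, minor) >= (2, 2)


print(supports_train_step_override("2.1.0"))   # False
print(supports_train_step_override("2.10.0"))  # True
```

In a real script the argument would be `tf.__version__`, checked before relying on a custom `train_step`.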
