Correctly implementing the autoencoder MSE loss function in TF2/Keras

Can someone explain the difference between the following two to me?

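Assume a vanilla autoencoder with real-valued inputs. According to this source and this one, its loss function should be as follows. In other words, a) for each element of an example we compute the squared difference, b) we sum over all elements of the example, and c) we take the mean over all examples.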

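import tensorflow as tf

def MSE_custom(y_true, y_pred):
    # Sum of squared errors per example (axis=1), halved, then averaged over the batch.
    return tf.reduce_mean(
        0.5 * tf.reduce_sum(
            tf.square(tf.subtract(y_true, y_pred)),
            axis=1
        )
    )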

However, in most implementations I see: autoencoder.compile(loss='mse', ...).

I don't see how these two are the same. Consider this example:

y_true = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y_pred = [[0.0, 0.8, 0.9], [0.5, 0.7, 0.6], [0.8, 0.7, 0.5]]
result1 = MSE_custom(y_true, y_pred)  # 0.355 
mse = tf.keras.losses.MeanSquaredError(reduction=tf.keras.losses.Reduction.AUTO)
result2 = mse(y_true, y_pred)  # 0.237

What am I missing?

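As the TensorFlow documentation explains, Keras first averages the squared errors over the last axis (the features of each example), which gives one value per example. The reduction setting then decides how those per-example values are combined: SUM adds them up, SUM_OVER_BATCH_SIZE averages them over the batch (so the total squared error ends up divided by the number of elements in the tensor), and NONE returns them unreduced. The code below reproduces each of these MSE computations by hand.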

import tensorflow as tf
y_true = [[0.0, 1.0, 0.0], [0.8, 0.9, 1.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]
y_pred = [[0.0, 0.8, 0.9], [0.5, 0.7, 0.6], [0.8, 0.7, 0.5], [0.9, 0.1, 0.3]]
##############################################
# Loss reduction: "SUM"
##############################################
reduction = tf.keras.losses.Reduction.SUM
mse_1 = tf.keras.losses.MeanSquaredError(reduction=reduction)
print(mse_1(y_true, y_pred))
# tf.Tensor(0.54333335, shape=(), dtype=float32)
def MSE_1(y_true, y_pred):
    x = tf.reduce_sum(tf.square(tf.subtract(y_true, y_pred)))
    y = tf.cast(tf.shape(y_true)[1], tf.float32)  # number of features (size of axis 1)
    return tf.divide(x, y)
print(MSE_1(y_true, y_pred))
# tf.Tensor(0.54333335, shape=(), dtype=float32)
##############################################
# Loss reduction: "SUM_OVER_BATCH_SIZE"
##############################################
reduction = tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE
mse_2 = tf.keras.losses.MeanSquaredError(reduction=reduction)
print(mse_2(y_true, y_pred))
# tf.Tensor(0.13583334, shape=(), dtype=float32)
def MSE_2(y_true, y_pred):
    x = tf.reduce_sum(tf.square(tf.subtract(y_true, y_pred)))
    y = tf.cast(tf.multiply(tf.shape(y_true)[0], tf.shape(y_true)[1]), tf.float32)  # total number of elements (batch size * features)
    return tf.divide(x, y)
print(MSE_2(y_true, y_pred))
# tf.Tensor(0.13583334, shape=(), dtype=float32)
##############################################
# Loss reduction: "NONE"
##############################################
reduction = tf.keras.losses.Reduction.NONE
mse_3 = tf.keras.losses.MeanSquaredError(reduction=reduction)
print(mse_3(y_true, y_pred))
# tf.Tensor([0.28333333 0.09666666 0.12666667 0.03666667], shape=(4,), dtype=float32)
def MSE_3(y_true, y_pred):
    x = tf.reduce_sum(tf.square(tf.subtract(y_true, y_pred)), axis=1)  # one sum per example
    y = tf.cast(tf.shape(y_true)[1], tf.float32)  # number of features (size of axis 1)
    return tf.divide(x, y)
print(MSE_3(y_true, y_pred))
# tf.Tensor([0.28333333 0.09666666 0.12666667 0.03666667], shape=(4,), dtype=float32)
# recover "SUM" loss reduction
print(tf.reduce_sum(mse_3(y_true, y_pred)))
# tf.Tensor(0.54333335, shape=(), dtype=float32)
print(tf.reduce_sum(MSE_3(y_true, y_pred)))
# tf.Tensor(0.54333335, shape=(), dtype=float32)
# recover "SUM_OVER_BATCH_SIZE" loss reduction
print(tf.divide(tf.reduce_sum(mse_3(y_true, y_pred)), tf.cast(tf.shape(y_true)[0], tf.float32)))
# tf.Tensor(0.13583334, shape=(), dtype=float32)
print(tf.divide(tf.reduce_sum(MSE_3(y_true, y_pred)), tf.cast(tf.shape(y_true)[0], tf.float32)))
# tf.Tensor(0.13583334, shape=(), dtype=float32)

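There are two differences.

1. The Keras loss averages over all dimensions, i.e. your reduce_sum should be replaced by reduce_mean.
2. The Keras loss does not multiply by 0.5.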

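In your example each sample has three features (axis 1 has size 3), so we can get the Keras loss from your result by dividing by 3 (to emulate the mean) and multiplying by 2 (to undo the 0.5). Indeed, 0.355 * 2/3 == 0.237 (approximately).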
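The conversion can be checked by hand without TensorFlow. Below is a minimal plain-Python sketch on the question's data; the helper names `squared_errors`, `custom_loss` and `keras_style_mse` are made up for this illustration and only mimic the two formulas, they are not library functions.

```python
# Plain-Python check of the arithmetic (no TensorFlow needed).
# All function names are illustrative, not part of any library.
y_true = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y_pred = [[0.0, 0.8, 0.9], [0.5, 0.7, 0.6], [0.8, 0.7, 0.5]]


def squared_errors(y_true, y_pred):
    """Element-wise squared differences, one inner list per example."""
    return [[(t - p) ** 2 for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(y_true, y_pred)]


def custom_loss(y_true, y_pred):
    """Question's formula: mean over examples of 0.5 * sum over features."""
    sq = squared_errors(y_true, y_pred)
    return sum(0.5 * sum(row) for row in sq) / len(sq)


def keras_style_mse(y_true, y_pred):
    """Mean over features per example, then mean over examples."""
    sq = squared_errors(y_true, y_pred)
    return sum(sum(row) / len(row) for row in sq) / len(sq)


n_features = len(y_true[0])  # 3
custom = custom_loss(y_true, y_pred)
keras_like = keras_style_mse(y_true, y_pred)

print(round(custom, 3))      # 0.355
print(round(keras_like, 3))  # 0.237
# Undo the 0.5 factor and turn the per-example sum into a mean.
print(round(custom * 2 / n_features, 3))  # 0.237
```

The factor between the two losses is n_features / 2, so it changes whenever the input dimensionality changes, but it stays constant for a given model.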

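These differences may look confusing, but in the end they don't matter: dividing by N and multiplying by 2 are both constant factors, so they only scale the gradient by a constant factor.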
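To see the constant factor concretely, here is a rough sketch that compares the gradients of the two losses with respect to y_pred. It assumes central finite differences in plain Python rather than TensorFlow's automatic differentiation, and `custom_loss`, `mean_squared_error` and `numerical_grad` are hypothetical helper names used only for this illustration.

```python
# Finite-difference sketch: the gradients of the two losses differ only by
# a constant factor. All names here are illustrative.
y_true = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y_pred = [[0.0, 0.8, 0.9], [0.5, 0.7, 0.6], [0.8, 0.7, 0.5]]


def custom_loss(y_true, y_pred):
    """Mean over examples of 0.5 * sum of squared errors."""
    per_example = [0.5 * sum((t - p) ** 2 for t, p in zip(tr, pr))
                   for tr, pr in zip(y_true, y_pred)]
    return sum(per_example) / len(per_example)


def mean_squared_error(y_true, y_pred):
    """Mean of the squared errors over every element."""
    sq = [(t - p) ** 2 for tr, pr in zip(y_true, y_pred)
          for t, p in zip(tr, pr)]
    return sum(sq) / len(sq)


def numerical_grad(loss, y_true, y_pred, eps=1e-6):
    """Central-difference gradient of loss with respect to y_pred."""
    grad = []
    for i, row in enumerate(y_pred):
        grad_row = []
        for j in range(len(row)):
            plus = [r[:] for r in y_pred]
            minus = [r[:] for r in y_pred]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad_row.append(
                (loss(y_true, plus) - loss(y_true, minus)) / (2 * eps))
        grad.append(grad_row)
    return grad


g_custom = numerical_grad(custom_loss, y_true, y_pred)
g_mse = numerical_grad(mean_squared_error, y_true, y_pred)

# Skip entries where the prediction is already exact (zero gradient).
ratios = [gc / gm
          for rc, rm in zip(g_custom, g_mse)
          for gc, gm in zip(rc, rm)
          if abs(gm) > 1e-9]
print([round(r, 4) for r in ratios])  # every ratio is n_features / 2 = 1.5
```

With plain gradient descent, a constant factor on the gradient acts like a rescaled learning rate, which is why the two losses lead to the same minimum.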

Edit: the following computation should give the same result as the Keras loss:

mse_custom = tf.reduce_mean((y_true - y_pred) ** 2)  # y_true and y_pred must be tensors or NumPy arrays; plain Python lists do not support `-`

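For simplicity I used the overloaded Python operators instead of the TF ops (subtract/square). This simply averages the whole 2D matrix in one go, which is the same as taking the mean over axis 1 and then averaging over axis 0.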
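That equivalence is easy to confirm on the question's numbers. The snippet below is a small plain-Python check, assuming every row has the same length; the squared errors are written out as rounded literals and the variable names are only illustrative.

```python
# One mean over the whole matrix vs. mean over axis 1, then over axis 0.
# Holds when every row has the same number of elements.
sq_err = [[0.00, 0.04, 0.81],
          [0.25, 0.49, 0.16],
          [0.04, 0.09, 0.25]]  # squared errors from the question's example

flat = [v for row in sq_err for v in row]
mean_all = sum(flat) / len(flat)

row_means = [sum(row) / len(row) for row in sq_err]  # mean over axis 1
mean_of_means = sum(row_means) / len(row_means)      # then mean over axis 0

print(round(mean_all, 4), round(mean_of_means, 4))  # 0.2367 0.2367
```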
