Hello, my training data has many missing values in the labels. For example, a single label can have the following values:
[nan, 0, 0, nan, 1, 0]
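I want to train a classification model that ignores the nan values. So far I have filled the nan values with -1 and tried to slice them out. Masking does not work, because categorical cross-entropy still takes the masked entries into account.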
ix = tf.where(tf.not_equal(y_true, -1))
true = tf.gather(y_true, ix)
pred = tf.gather(y_pred, ix)
return keras.objectives.categorical_crossentropy(true, pred)
is what I have been able to come up with so far, but it fails with this error:
InvalidArgumentError (see above for traceback): Incompatible shapes: [131] vs. [128]
[[Node: mul_1 = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](Mean, _recv_dense_3_sample_weights_0/_13)]]
Does anyone know how to handle this?
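You can write a custom loss function that temporarily replaces the missing values with zeros. Then, after computing the cross-entropy loss, set the loss to zero at every position where the label was missing.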
import numpy as np
import tensorflow as tf

# TF 1.x only: eager execution is already the default in TF 2.x, where this call does not exist.
tf.enable_eager_execution()


def missing_values_cross_entropy_loss(y_true, y_pred):
    # Add a small epsilon to avoid computing the logarithm of 0 (consider y_hat == 0.0 or y_hat == 1.0).
    epsilon = tf.constant(1.0e-30, dtype=np.float32)

    # Check that there are no NaN values in the predictions (a neural network shouldn't output NaNs).
    y_pred = tf.debugging.assert_all_finite(y_pred, 'y_pred contains NaN')

    # Temporarily replace missing values with zeros, storing the missing-values mask for later.
    y_true_not_nan_mask = tf.logical_not(tf.math.is_nan(y_true))
    y_true_nan_replaced = tf.where(tf.math.is_nan(y_true), tf.zeros_like(y_true), y_true)

    # Cross entropy, split into multiple lines for readability:
    # y * log(y_hat)
    positive_predictions_cross_entropy = y_true_nan_replaced * tf.math.log(y_pred + epsilon)
    # (1 - y) * log(1 - y_hat)
    negative_predictions_cross_entropy = (1.0 - y_true_nan_replaced) * tf.math.log(1.0 - y_pred + epsilon)
    # c(y, y_hat) = -(y * log(y_hat) + (1 - y) * log(1 - y_hat))
    cross_entropy_loss = -(positive_predictions_cross_entropy + negative_predictions_cross_entropy)

    # Use the missing-values mask to zero out the loss wherever the label was missing.
    # (y_true_not_nan_mask is boolean; cast to float it takes the values 0.0 or 1.0.)
    cross_entropy_loss_discarded_nan_labels = cross_entropy_loss * tf.cast(y_true_not_nan_mask, tf.float32)

    mean_loss_per_row = tf.reduce_mean(cross_entropy_loss_discarded_nan_labels, axis=1)
    mean_loss = tf.reduce_mean(mean_loss_per_row)
    return mean_loss
y_true = tf.constant([
    [0, 1, np.nan, 0],
    [0, 1, 1, 0],
    [np.nan, 1, np.nan, 0],
    [1, 1, 0, np.nan],
])
y_pred = tf.constant([
    [0.1, 0.7, 0.1, 0.3],
    [0.2, 0.6, 0.1, 0],
    [0.1, 0.9, 0.3, 0.2],
    [0.1, 0.4, 0.4, 0.2],
])
loss = missing_values_cross_entropy_loss(y_true, y_pred)
# Extract value from EagerTensor.
print(loss.numpy())
Output:
0.4945919
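As a sanity check, the same number can be reproduced without TensorFlow. The following is only a rough NumPy sketch of the same arithmetic (replace nan labels with zero, compute element-wise binary cross-entropy, zero out the masked positions, then average). The variable names are chosen for this illustration.

```python
import numpy as np

y_true = np.array([
    [0, 1, np.nan, 0],
    [0, 1, 1, 0],
    [np.nan, 1, np.nan, 0],
    [1, 1, 0, np.nan],
], dtype=np.float64)
y_pred = np.array([
    [0.1, 0.7, 0.1, 0.3],
    [0.2, 0.6, 0.1, 0.0],
    [0.1, 0.9, 0.3, 0.2],
    [0.1, 0.4, 0.4, 0.2],
], dtype=np.float64)

eps = 1e-30  # same epsilon as in the TensorFlow version

# True where a label is present, False where it is nan.
observed = ~np.isnan(y_true)
# Temporarily replace nan labels with 0 so the arithmetic stays finite.
y_filled = np.where(observed, y_true, 0.0)

# Element-wise binary cross-entropy.
ce = -(y_filled * np.log(y_pred + eps)
       + (1.0 - y_filled) * np.log(1.0 - y_pred + eps))

# Zero out the loss at positions whose label was missing.
ce_masked = ce * observed

# Mean per row (over all four columns), then mean over rows.
numpy_loss = ce_masked.mean(axis=1).mean()
print(numpy_loss)
```

The printed value should agree with the TensorFlow output above to within floating-point precision (the NumPy version computes in float64 rather than float32).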
Use the loss function when compiling the Keras model, as described in the documentation:
model.compile(loss=missing_values_cross_entropy_loss, optimizer='sgd')
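One thing to keep in mind: the loss above averages over all label positions in a row, including the masked ones, so a row with many missing labels contributes a smaller loss than a fully labelled row. Whether that is what you want depends on the task. Below is a rough NumPy sketch of the difference between dividing by all positions and dividing only by the number of observed labels. The helper `masked_bce_numpy` is a name made up for this illustration and is not part of TensorFlow or Keras.

```python
import numpy as np


def masked_bce_numpy(y_true, y_pred, eps=1e-30):
    """Illustrative helper: per-row masked binary cross-entropy, two normalizations."""
    observed = ~np.isnan(y_true)                # True where a label exists
    y_filled = np.where(observed, y_true, 0.0)  # nan -> 0 so the math stays finite
    ce = -(y_filled * np.log(y_pred + eps)
           + (1.0 - y_filled) * np.log(1.0 - y_pred + eps))
    ce = ce * observed                          # zero loss at missing labels

    # Normalization used in the answer above: divide by every position in the row.
    mean_over_all = ce.mean(axis=1)

    # Alternative: divide by the number of observed labels only.
    # np.maximum(..., 1) avoids division by zero for rows with no labels at all.
    n_observed = observed.sum(axis=1)
    mean_over_observed = ce.sum(axis=1) / np.maximum(n_observed, 1)
    return mean_over_all, mean_over_observed


# A single row in which two of the four labels are missing.
y_true = np.array([[np.nan, 1.0, np.nan, 0.0]])
y_pred = np.array([[0.1, 0.9, 0.3, 0.2]])

over_all, over_observed = masked_bce_numpy(y_true, y_pred)
print(over_all, over_observed)
```

With half of the labels missing, the second value is twice the first. If the observed-only normalization is preferred, the same change could presumably be made in the TensorFlow function by replacing the per-row `tf.reduce_mean` with a `tf.reduce_sum` divided by the sum of the float mask (for example via `tf.math.divide_no_nan`).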