Pearson correlation as a loss function

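I am training a TensorFlow feed-forward network whose goal is to produce predictions between 0 and 1 that are as close as possible to a target score. Each training instance has about 450 features, and the dataset contains about 1,500 examples. The network uses 4 layers, each with a ReLU activation, followed by a final "out" layer with a sigmoid activation. With MSE as the loss function I get decent (but not optimal) results. I then tried the following as the loss function: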

# Define loss and optimizer
# Pearson correlation as loss function
length = 443
# Apply L2 regularization
Beta = 0.01
regularizer = (tf.nn.l2_loss(weights['h1']) + tf.nn.l2_loss(weights['h2'])
               + tf.nn.l2_loss(weights['h3']) + tf.nn.l2_loss(weights['h4']))
# Used to report correlation
pearson = tf.contrib.metrics.streaming_pearson_correlation(intensity, Y, name="pearson")

# Pearson corr. as loss?
# Multiply by -1 to maximize correlation, i.e. minimize negative correlation
original_loss = (-1 * length * tf.reduce_sum(tf.multiply(intensity, Y))
                 - (tf.reduce_sum(intensity) * tf.reduce_sum(Y)))
divisor = (tf.sqrt(length * tf.reduce_sum(tf.square(intensity))
                   - tf.square(tf.reduce_sum(intensity)))
           * tf.sqrt(length * tf.reduce_sum(tf.square(Y))
                     - tf.square(tf.reduce_sum(Y))))
loss_op = tf.truediv(original_loss, divisor)
loss_op = tf.reduce_mean(loss_op + Beta * regularizer)
# Init optimizer
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=1e-09)
train_op = optimizer.minimize(loss_op)

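The idea is to minimize the negative correlation, i.e. maximize the positive correlation. However, after extensive experimentation with the hyperparameters, this still produces "nan" errors and reports a "nan" Pearson correlation. Any ideas why this happens?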
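Two properties of the divisor in the snippet can yield NaN regardless of hyperparameters, and they are easy to check by hand. The sketch below is plain Python with made-up numbers; the helper name `radicand` is hypothetical, and it assumes `length` is intended to be the number of values being summed. It evaluates the quantity under each `tf.sqrt`, namely n·Σv² − (Σv)². If `length` is smaller than the number of values actually summed, that quantity can become negative (and `tf.sqrt` of a negative number is NaN); if all values are identical, for example a saturated sigmoid output, it is exactly zero and the division becomes 0/0.

```python
def radicand(values, length):
    """Return length * sum(v^2) - (sum(v))^2, the term under each tf.sqrt."""
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return length * total_sq - total * total

# Hypothetical batch of 4 sigmoid outputs (chosen to be exact in binary floating point).
preds = [0.25, 0.5, 0.75, 0.5]

ok = radicand(preds, length=len(preds))    # length matches the number of values
too_small = radicand(preds, length=2)      # length smaller than the number of values
constant = radicand([0.5] * 4, length=4)   # all predictions identical

print(ok, too_small, constant)
```

Whether either case applies here depends on whether 443 really is the number of elements in `intensity` and `Y` at each step, which the post does not state. Note also that in the snippet the leading -1 applies only to the first product, so the numerator is -nΣxy − ΣxΣy rather than -(nΣxy − ΣxΣy); that changes the value being optimized but is not by itself a source of NaN.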

Note that tf.contrib.metrics.streaming_pearson_correlation() returns a tuple (pearson_r, update_op), so in principle you should be able to feed update_op directly into optimizer.minimize().
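A different route, which does not rely on the streaming metric at all, is to compute the correlation of the current batch directly from mean-centered tensors. The following is a minimal sketch under stated assumptions, not the poster's method: the function name `negative_pearson_loss` and the `eps` default are hypothetical, and the usage lines assume TensorFlow 2.x eager execution (the ops themselves also exist in 1.x). Centering removes the need for a hand-set `length`, and the small `eps` inside the square root keeps the result finite when one of the inputs is constant.

```python
import tensorflow as tf

def negative_pearson_loss(y_true, y_pred, eps=1e-8):
    """Negative Pearson correlation of two tensors, computed from centered values."""
    # Flatten so that shapes (N,) and (N, 1) are treated the same way.
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    true_c = y_true - tf.reduce_mean(y_true)
    pred_c = y_pred - tf.reduce_mean(y_pred)
    covariance = tf.reduce_sum(true_c * pred_c)
    # eps (an assumed safeguard) avoids sqrt(0) and the resulting 0/0.
    norm = tf.sqrt(
        tf.reduce_sum(tf.square(true_c)) * tf.reduce_sum(tf.square(pred_c)) + eps
    )
    return -covariance / norm

# Usage with made-up values (TensorFlow 2.x eager execution assumed).
perfect = float(negative_pearson_loss([0.1, 0.4, 0.8], [0.1, 0.4, 0.8]))
flat = float(negative_pearson_loss([0.1, 0.4, 0.8], [0.5, 0.5, 0.5]))
```

With identical inputs the loss is close to -1, and with constant predictions it stays at zero instead of turning into NaN. A regularization term such as `Beta * regularizer` from the question could be added to the returned value in the same way as before.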
