Why does the loss function output NaN after increasing the number of epochs?



While learning TensorFlow (version 1) from a textbook, I ran into the following problem:

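import numpy as np
import tensorflow as tf  # TensorFlow 1.x
from sklearn.datasets import make_circles

# plot_model, f_fn and generate_batches are not defined in this snippet;
# presumably they are helpers provided by the text.

# Generate data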
xy, labels = make_circles(n_samples=200, noise=0.1, random_state=717)
features = xy
num_hidden1 = 10
num_hidden2 = 5
x = tf.placeholder(tf.float32, shape=(None, 2))
y = tf.placeholder(tf.float32, shape=(None, 1))
rand_init = tf.random_normal_initializer(seed=624)
# Hidden Layer 1
hidden1 = tf.contrib.layers.fully_connected(x, num_hidden1,
                                            activation_fn=tf.nn.sigmoid,
                                            weights_initializer=rand_init,
                                            biases_initializer=rand_init)
# Hidden Layer 2
hidden2 = tf.contrib.layers.fully_connected(hidden1, num_hidden2,
                                            activation_fn=tf.nn.sigmoid,
                                            weights_initializer=rand_init,
                                            biases_initializer=rand_init)
# Output Layer
yhat = tf.contrib.layers.fully_connected(hidden2, 1,
                                         activation_fn=tf.nn.sigmoid,
                                         weights_initializer=rand_init,
                                         biases_initializer=rand_init)
loss = tf.reduce_mean(-y * tf.log(yhat) - (1-y) * tf.log(1-yhat))
# Prepare algorithm
MaxEpochs = 2500
lr = 0.1
optimizer = tf.train.AdamOptimizer(lr)
train = optimizer.minimize(loss)
# Shuffle data
np.random.seed(7382)
idx = np.arange(0, len(features))
np.random.shuffle(idx)
shuffled_features = features[idx]
shuffled_labels = labels[idx]
# Stochastic method
batch_size = 25
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for epoch in range(MaxEpochs):
    if epoch % 500 == 0:
        # Evaluate and plot on the full data set every 500 epochs
        loss_val = sess.run(loss, feed_dict={x: features, y: labels.reshape(-1, 1)})
        plot_model(sess, yhat, xy, labels, f_fn, 'Epoch {}\n (loss={:1.2f})'.format(epoch, loss_val))
    for x_batch, y_batch in generate_batches(batch_size, shuffled_features, shuffled_labels):
        sess.run(train, feed_dict={x: x_batch, y: y_batch.reshape(-1, 1)})
# Final evaluation after training
loss_val = sess.run(loss, feed_dict={x: features, y: labels.reshape(-1, 1)})
print(loss_val)
plot_model(sess, yhat, xy, labels, f_fn, 'Epoch {}\n (loss={:1.2f})'.format(epoch + 1, loss_val))

The run prints these warnings:

/usr/local/lib/python3.6/dist-packages/matplotlib/contour.py:1483: UserWarning: Warning: converting a masked element to nan.
  self.zmax = float(z.max())
/usr/local/lib/python3.6/dist-packages/matplotlib/contour.py:1134: RuntimeWarning: invalid value encountered in greater
  over = np.nonzero(lev > self.zmax)[0]

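With MaxEpochs = 20 the example works fine. To show overfitting (with loss=0.18), the example then raises MaxEpochs to 2500. When I run that, however, the loss function starts outputting NaN after about 400 epochs.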

Is the example in the text outdated? Or is this a bug?

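When you train for more iterations, you should lower your learning rate. For 2500 epochs, your learning rate of 0.1 is most likely too high. Try a smaller lr and run the 2500 epochs again to demonstrate overfitting to the training data.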
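One plausible mechanism behind the NaN, offered as a sketch and not a confirmed diagnosis of the run above: large updates can drive the final sigmoid into saturation, so yhat becomes exactly 0 or 1 in floating point, and the hand-written cross-entropy then evaluates 0 * log(0). The NumPy snippet below reproduces that arithmetic outside TensorFlow. The logit values, the helper names and the eps clipping constant are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, yhat):
    # Same formula as `loss` in the question, written with NumPy.
    return np.mean(-y * np.log(yhat) - (1 - y) * np.log(1 - yhat))

def clipped_cross_entropy(y, yhat, eps=1e-7):
    # Keep predictions strictly inside (0, 1) so both logs stay finite.
    return cross_entropy(y, np.clip(yhat, eps, 1 - eps))

y = np.array([1.0, 0.0], dtype=np.float32)
# Hypothetical logits of large magnitude, as large weight updates could produce.
logits = np.array([40.0, -40.0], dtype=np.float32)
yhat = sigmoid(logits)  # yhat[0] rounds to exactly 1.0

# Silence the expected log(0) and 0 * inf warnings for the naive version.
with np.errstate(divide="ignore", invalid="ignore"):
    naive_loss = cross_entropy(y, yhat)

safe_loss = clipped_cross_entropy(y, yhat)

print("naive loss:", naive_loss)
print("clipped loss:", safe_loss)
```

In the question's TensorFlow 1 code, the analogous options would be a smaller lr (the exact value is something to experiment with), clipping yhat with tf.clip_by_value before calling tf.log, or computing the loss from the pre-sigmoid output with tf.nn.sigmoid_cross_entropy_with_logits.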
