TensorFlow: cost function outputs 0 for some input data



Hi, I'm studying machine learning and tried softmax classification with a neural network. During training, the cost for labels 1 and 2 decreases as expected, but for label 3 the cost is always 0.0. I suspect I don't fully understand neural networks yet.

I am trying to build a learning model with input_sequence_length = 3 and output_class = 3:

0 <= input <= 2  result = 1

3 <= input <= 5  result = 2

6 <= input <= 8  result = 3
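
In other words, the label is just the band index plus one. A minimal sketch of the rule (label_for is a hypothetical helper, not part of my program; it assumes all three values of a row fall in the same band):

def label_for(row):
    # Band 0-2 -> label 1, 3-5 -> label 2, 6-8 -> label 3.
    return row[0] // 3 + 1

print(label_for([0, 2, 0]))  # 1
print(label_for([5, 4, 5]))  # 2
print(label_for([7, 6, 8]))  # 3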

Please let me know what I'm missing.

The source code below is only part of the full program.


Input data (0~2 -> 1, 3~5 -> 2, 6~8 -> 3); the last column (1, 2, 3) is the label

x1  x2  x3  label
0   2   0   1
5   4   5   2
7   6   8   3
2   2   0   1
5   3   4   2
7   6   7   3

Output

1. input X : [[0, 2, 0]] Y(label) : [[1]]
cost : 1.25544  hypothesis : [0.30000001, 0.28, 0.41]
2. input X : [[5, 4, 5]] Y(label) : [[2]]
cost : 1.10084 hypothesis : [0.31, 0.36000001, 0.33000001]
3. input X : [[7, 6, 8]] Y(label) : [[3]]
cost : 0.0  hypothesis : [0.28, 0.25, 0.47999999]
4. input X : [[2, 2, 0]] Y(label) : [[1]]
cost : 1.22364  hypothesis : [0.27000001, 0.28999999, 0.44]
5. input X : [[5, 3, 4]] Y(label) : [[2]]
cost : 0.961203  hypothesis : [0.30000001, 0.31999999, 0.38]
6. input X : [[7, 6, 7]] Y(label) : [[3]]
cost : 0.0 hypothesis : [0.27000001, 0.23999999, 0.49000001]

Source code

import tensorflow as tf

batch_size = 1
input_sequence_length = 3 
output_sequence_length = 1
input_num_classes = 9
output_num_classes = 3
hidden_size = 12
learning_rate = 0.1

with tf.name_scope("placeholder") as scope:
    X = tf.placeholder(tf.int32, [None, input_sequence_length], name="x_input")
    X_one_hot = tf.one_hot(X, input_num_classes)
    Y = tf.placeholder(tf.int32, [None, output_sequence_length], name="y_input")  # 1
    Y_one_hot = tf.one_hot(Y, output_num_classes)  # one hot
    Y_one_hot = tf.reshape(Y_one_hot, [-1, output_num_classes])
    X_one_hot = tf.reshape(X_one_hot, [batch_size , input_sequence_length * input_num_classes])
    outputs = tf.to_float(X_one_hot)

with tf.name_scope("Layer_1") as scope:
    W1 = tf.Variable(tf.random_normal([input_sequence_length * input_num_classes, hidden_size]), name='weight1')
    b1 = tf.Variable(tf.random_normal([hidden_size]), name='bias1')
    outputs = tf.sigmoid(tf.matmul(outputs, W1) + b1)

with tf.name_scope("Layer_2") as scope:
    W2 = tf.Variable(tf.random_normal([hidden_size, output_num_classes]), name='weight2')
    b2 = tf.Variable(tf.random_normal([output_num_classes]), name='bias2')
    logits = tf.sigmoid(tf.matmul(outputs, W2) + b2)

with tf.name_scope("hypothesis") as scope:
    hypothesis = tf.nn.softmax(logits)
with tf.name_scope("cost") as scope:
    # Cross entropy cost/loss
    cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot)
    cost = tf.reduce_mean(cost_i)
with tf.name_scope("train") as scope:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
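
The training loop is not included in the snippet; a loop of roughly the following shape produces the numbered log output above (a simplified sketch, not my exact code):

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# The six rows from the input data table: three features plus a label each.
data = [([0, 2, 0], [1]), ([5, 4, 5], [2]), ([7, 6, 8], [3]),
        ([2, 2, 0], [1]), ([5, 3, 4], [2]), ([7, 6, 7], [3])]

for step in range(1000):
    for i, (x, y) in enumerate(data):
        c, h, _ = sess.run([cost, hypothesis, optimizer],
                           feed_dict={X: [x], Y: [y]})
        if step % 100 == 0:
            print("%d. input X : %s Y(label) : %s" % (i + 1, [x], [y]))
            print("cost : %s  hypothesis : %s" % (c, h))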

Answer

The first argument (indices) of tf.one_hot is zero-indexed, so with a depth of 3 the only valid indices are 0, 1, and 2.

tf.one_hot(3, 3) therefore produces [0, 0, 0] rather than raising an error, and the cross entropy of an all-zero label vector is 0 regardless of the logits. That is why the cost for label 3 is always 0.0.
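
A quick demo of the behavior, and the fix: shift the 1-based labels into the 0-based range before one-hot encoding. A minimal sketch against the placeholder code above:

import tensorflow as tf

output_sequence_length = 1
output_num_classes = 3

# Demo: an out-of-range index silently yields an all-zero vector, not an error.
with tf.Session() as sess:
    print(sess.run(tf.one_hot(2, 3)))  # [0. 0. 1.]  last valid index for depth 3
    print(sess.run(tf.one_hot(3, 3)))  # [0. 0. 0.]  index 3 is out of range

# Fix: map labels 1..3 to indices 0..2 before encoding.
Y = tf.placeholder(tf.int32, [None, output_sequence_length], name="y_input")
Y_one_hot = tf.one_hot(Y - 1, output_num_classes)  # label 3 -> index 2 -> [0, 0, 1]
Y_one_hot = tf.reshape(Y_one_hot, [-1, output_num_classes])

With an all-zero label vector, -sum(y_i * log(p_i)) is 0 no matter what the network predicts, which matches the cost you see for label 3. Note also that tf.nn.softmax_cross_entropy_with_logits expects raw, unscaled logits, so the tf.sigmoid applied in Layer_2 should probably be dropped as well.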
