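Understanding tensor inputs and transformations for an LSTM (dynamic RNN)

I am building an LSTM-style neural network in TensorFlow, and I am having trouble understanding exactly what input is required and what transformations tf.nn.dynamic_rnn applies before its output is passed to the sparse_softmax_cross_entropy_with_logits layer.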

https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn

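Understanding the input

The input function sends a feature tensor of the form

[batch_size, max_time]

However, the manual states that the input tensor must have the form

[batch_size, max_time, ...]

I have therefore expanded the input with a trailing dimension of size 1, so that it takes the form

[batch_size, max_time, 1]

At this point the input no longer breaks at run time, but I do not fully understand what we have done here, and I suspect it may be the cause of the problem when computing the loss (see below).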
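To make the shapes concrete, the expansion described above can be reproduced with a small NumPy sketch. NumPy and the sizes below are assumptions chosen only for illustration and are not part of the original code. The trailing axis holds the features available at each time step, so a size of 1 means every time step carries a single scalar value.

```python
import numpy as np

# Illustrative sizes only (assumed, not taken from the original post)
batch_size, max_time = 4, 5

# One scalar per (example, time step): shape [batch_size, max_time]
features_2d = np.arange(batch_size * max_time, dtype=np.float64).reshape(batch_size, max_time)

# Add a trailing feature axis of size 1: shape [batch_size, max_time, 1]
features_3d = np.expand_dims(features_2d, axis=-1)

print(features_2d.shape)  # (4, 5)
print(features_3d.shape)  # (4, 5, 1)
```

The values are unchanged by this step; only an extra axis is added. In TensorFlow, tf.expand_dims(features, -1) performs the equivalent operation.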

Understanding the transformations

This expanded tensor is the "features" tensor used in the code below

LSTM_SIZE = 3
lstm_cell = rnn.BasicLSTMCell(LSTM_SIZE, forget_bias=1.0)
outputs, _ = tf.nn.dynamic_rnn(lstm_cell, features, dtype=tf.float64)
#slice to keep only the last cell of the RNN
outputs = outputs[-1]
#softmax layer
with tf.variable_scope('softmax'):
    W = tf.get_variable('W', [LSTM_SIZE, n_classes], dtype=tf.float64)
    b = tf.get_variable('b', [n_classes], initializer=tf.constant_initializer(0.0), dtype=tf.float64)
logits = tf.matmul(outputs, W) + b
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels))

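This raises a ValueError at the loss:

Dimensions must be equal, but are [max_time, num_classes] and [batch_size]

From https://www.tensorflow.org/versions/r0.12/api_docs/python/nn/classification-

A common use case is to have logits of shape [batch_size, num_classes] and labels of shape [batch_size]. But higher dimensions are supported.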
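The shape contract quoted above can be sketched in NumPy. This is a simplified stand-in written for illustration, not TensorFlow's implementation; the function name, the error message, and the sizes are all assumptions. It shows that the labels must match the logits shape with the last (class) axis removed.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy for integer class labels (illustrative sketch)."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # labels must have the shape of logits without the trailing class axis
    if logits.shape[:-1] != labels.shape:
        raise ValueError(
            "logits shape %s does not match labels shape %s"
            % (logits.shape, labels.shape)
        )
    # Numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class for each example
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -picked[..., 0]

batch_size, num_classes = 4, 3          # assumed sizes
logits = np.zeros((batch_size, num_classes))
labels = np.array([0, 2, 1, 0])         # one integer class per example
loss = sparse_softmax_cross_entropy(logits, labels)
print(loss.shape)  # (4,)
```

If the first axis of the logits were max_time instead of batch_size, the shape check would fail, which mirrors the mismatch reported in the error above.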

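Somewhere in this process max_time and batch_size are getting mixed up, and I am not sure whether it happens at the input or during the LSTM. I would appreciate any advice!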

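This is because of the shape of the output of tf.nn.dynamic_rnn. From https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn:

outputs: The RNN output Tensor.

If time_major == False (default), this will be a Tensor shaped: [batch_size, max_time, cell.output_size].

If time_major == True, this will be a Tensor shaped: [max_time, batch_size, cell.output_size].

You are in the default case, so your outputs has shape [batch_size, max_time, output_size], and outputs[-1] selects the last element of the batch, giving a tensor of shape [max_time, output_size]. Slicing with outputs[:, -1] instead, which selects the last time step for every batch element, should probably fix it.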
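The difference between the two slices can be checked with a NumPy array of the same layout. This is a sketch with assumed sizes; NumPy stands in for the TensorFlow tensor here, and the weight matrix is a placeholder.

```python
import numpy as np

# Assumed sizes, laid out as in the default time_major == False case
batch_size, max_time, output_size, n_classes = 4, 5, 3, 2
outputs = np.zeros((batch_size, max_time, output_size))

last_example = outputs[-1]    # last batch element, all time steps
last_step = outputs[:, -1]    # last time step, for every batch element

print(last_example.shape)  # (5, 3) -> [max_time, output_size]
print(last_step.shape)     # (4, 3) -> [batch_size, output_size]

# Projecting the last time step gives one row of logits per example
W = np.zeros((output_size, n_classes))   # placeholder weights
logits = last_step @ W
print(logits.shape)        # (4, 2) -> [batch_size, n_classes]
```

With outputs[:, -1], the logits have batch_size rows, which is what labels of shape [batch_size] require.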
