Dimensions mismatch when building an LSTM RNN in Tensorflow



I am trying to build a multilayered, multiclass, multilabel LSTM in Tensorflow. I have been trying to bend this tutorial to my data.

However, I am getting an error saying that the dimensions don't match when building the RNN:

ValueError: Dimensions must be equal, but are 1000 and 923 for 'rnn/while/rnn/multi_rnn_cell/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,1000], [923,2000].
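For reference, the failing op is just a matrix product whose inner dimensions have to agree; a minimal sketch of the same failure, with the shapes taken from the error above:

import numpy as np

a = np.zeros((1, 1000))    # the activations entering the cell's MatMul: [?, 1000]
w = np.zeros((923, 2000))  # the cell's kernel: [923, 2000]
np.matmul(a, w)            # ValueError: inner dimensions 1000 and 923 don't match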

I can't figure out which variable in my architecture is incorrect:

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)

def lstm(x, weight, bias, n_steps, n_classes):
    cell = tf.nn.rnn_cell.LSTMCell(cfg.n_hidden_cells_in_layer, state_is_tuple=True)
    multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
    # FIXME : ERROR binding x to LSTM as it is
    output, state = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)
    # FIXME : ERROR
    output_flattened = tf.reshape(output, [-1, cfg.n_hidden_cells_in_layer])
    output_logits = tf.add(tf.matmul(output_flattened, weight), bias)
    output_all = tf.nn.sigmoid(output_logits)
    output_reshaped = tf.reshape(output_all, [-1, n_steps, n_classes])
    # ??? switch batch size with sequence size. ???
    # then gather last time step values
    output_last = tf.gather(tf.transpose(output_reshaped, [1, 0, 2]), n_steps - 1)

    return output_last, output_all
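For what it's worth, this is the shape flow I would expect through that function if everything lined up (my own annotation; I write n_hidden for cfg.n_hidden_cells_in_layer and assume x is batch-major):

# x:                [batch, n_steps, n_input]
# output:           [batch, n_steps, n_hidden]    (dynamic_rnn, batch-major)
# output_flattened: [batch * n_steps, n_hidden]
# => weight must be [n_hidden, n_classes] for the matmul to type-check
# output_logits:    [batch * n_steps, n_classes]
# output_reshaped:  [batch, n_steps, n_classes]
# output_last:      [batch, n_classes]            (values at the last time step)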

These are my placeholders, loss function and all that jazz:

x_test, y_test = load_multiple_vector_files(test_filepaths)
x_valid, y_valid = load_multiple_vector_files(valid_filepaths)
n_input, n_steps, n_classes = get_input_target_lengths(check_print=False)

# FIXME n_input should be the problem
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
y_steps = tf.placeholder("float", [None, n_classes])
weight = weight_variable([cfg.n_hidden_layers, n_classes])
bias = bias_variable([n_classes])
y_last, y_all = lstm(x, weight, bias, n_steps, n_classes)
#all_steps_cost=tf.reduce_mean(-tf.reduce_mean((y_steps * tf.log(y_all))+(1 - y_steps) * tf.log(1 - y_all),reduction_indices=1))
all_steps_cost = -tf.reduce_mean((y_steps * tf.log(y_all)) + (1 - y_steps) * tf.log(1 - y_all))
last_step_cost = -tf.reduce_mean((y * tf.log(y_last)) + ((1 - y) * tf.log(1 - y_last)))
loss_function = (cfg.alpha * all_steps_cost) + ((1 - cfg.alpha) * last_step_cost)
optimizer = tf.train.AdamOptimizer(learning_rate=cfg.learning_rate).minimize(loss_function)
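A side note unrelated to the shape error: with the cross-entropy written out by hand like this, tf.log produces NaNs as soon as the sigmoid saturates to exactly 0 or 1. A guard I would consider (a sketch; the epsilon is my choice, not part of the code above):

eps = 1e-7
y_all_safe = tf.clip_by_value(y_all, eps, 1.0 - eps)
all_steps_cost = -tf.reduce_mean((y_steps * tf.log(y_all_safe))
                                 + (1 - y_steps) * tf.log(1 - y_all_safe))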

I am fairly sure it is my X placeholder that's causing the problem, making the layers and their matrix dimensions mismatch. The linked example uses bare constants, so it's hard to tell what a given value actually stands for.
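One quick way to test that suspicion is to print the static shapes before the graph is built any further (a sketch):

print(x.get_shape())       # expect (?, n_steps, n_input)
print(weight.get_shape())  # must match the width of the flattened RNN output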

Can anyone help me out here? :)

UPDATE: I made an "educated guess" about the mismatched dimensions. One of them is 2*hidden_width, so the hidden layer gets its new input plus the old recurrent input. The mismatched dimension, however, is input_width + hidden_width, as if it were trying to set up the recurrency at the hidden layer's width for the input layer.
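Decoding the numbers from the error under that guess (my own arithmetic; the value 500 is inferred from the shapes, since an LSTMCell's kernel stacks its four gates into 4 * n_hidden columns):

n_hidden = 500      # inferred: 4 * 500 = 2000 kernel columns
input_width = 423   # inferred: 923 - 500
# kernel the cell builds: [input_width + n_hidden, 4 * n_hidden] = [923, 2000]
# what the op is fed:     [?, 2 * n_hidden] = [?, 1000], i.e. new input + recurrent input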

LATEST UPDATE: I found out I had set up the weight variable incorrectly, using the constant for n_hidden_layers (the number of hidden layers) instead of n_hidden_cells_in_layer (the number of cells in a layer).
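In code, the one-line correction that update describes:

# rows must match the flattened RNN output width (cells per layer), not the layer count
weight = weight_variable([cfg.n_hidden_cells_in_layer, n_classes])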