Building a language model with a recurrent neural network



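When I run my code, I get this error:

Exception: Input arrays should have the same number of samples as target arrays. Found 12196 input samples and 1 target samples.

Below is the model I am training.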

import sys
import time
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense
from keras.utils import np_utils
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
from keras.regularizers import l2
from keras.layers.wrappers import TimeDistributed
n_in = train_x.shape[1]
n_hidden = 100
n_out = word_vecs.shape[0]
number_of_epochs = 10
batch_size = 35
model = Sequential()
model.add(Embedding(output_dim=word_vecs.shape[1], input_dim=word_vecs.shape[0],
                    input_length=n_in, weights=[word_vecs], mask_zero=True))
model.add(LSTM(n_hidden, W_regularizer=l2(0.0001), U_regularizer=l2(0.0001), return_sequences=True))
model.add(TimeDistributed(Dense(n_out, activation='softmax', W_regularizer=l2(0.0001))))

model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

I also one-hot encoded my training targets.

Here is the code:

new_instance = []
for instance in train_y :
    new_vector = np.zeros(shape=(instance.shape[0],  word_vecs.shape[0]))
    print(instance.shape[0],  word_vecs.shape[0])
    new_vector[np.arange(new_vector.shape[0]), instance ] =1
new_instance.append(new_vector)
new_instance = np.array(new_instance)

This is the output of the one-hot encoding:
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
(260, 4075)
[[[ 1.  0.  0. ...,  0.  0.  0.]
  [ 1.  0.  0. ...,  0.  0.  0.]
  [ 1.  0.  0. ...,  0.  0.  0.]
  ..., 
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  0. ...,  0.  0.  0.]
  [ 0.  0.  1. ...,  0.  0.  0.]]]

And finally, the training loop:

for epoch in range(number_of_epochs):
    start_time = time.time()
    # Train for 1 epoch
    model.fit(train_x, new_instance, nb_epoch=1, batch_size=batch_size, verbose=False, shuffle=True)
    print("%.2f sec for training" % (time.time() - start_time))
    sys.stdout.flush()

I'm new to this, so please bear with me. Thank you.

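After a while, I found that the problem was wrong indentation in the one-hot encoding code: the new_instance.append(new_vector) line sat outside the for loop, so only the last vector was appended and the target array ended up with a single sample. That is why Keras reported 12196 input samples against 1 target sample. I also reduced my dataset size so that the model fits faster.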
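
A quick way to catch this kind of mismatch before calling fit is to compare the first dimension of the inputs and the targets. This is only a minimal sketch: the helper name check_same_samples and the toy shapes are made up for illustration and are not my real data.

```python
import numpy as np

def check_same_samples(inputs, targets):
    """Return True if both arrays have the same number of samples (axis 0)."""
    return inputs.shape[0] == targets.shape[0]

# Hypothetical toy sizes: 5 sequences, 4 time steps, vocabulary of 6 words.
toy_x = np.zeros((5, 4), dtype=int)
bad_targets = np.zeros((1, 4, 6))    # one sample only, like the mis-indented loop
good_targets = np.zeros((5, 4, 6))   # one one-hot matrix per input sequence

print(check_same_samples(toy_x, bad_targets))
print(check_same_samples(toy_x, good_targets))
```

Running a check like this on train_x and new_instance right before model.fit would have pointed at the target array straight away.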

Here is the corrected code:

new_instance = []
for instance in train_y :
    new_vector = np.zeros(shape=(instance.shape[0],  word_vecs.shape[0]))
    print(instance.shape[0],  word_vecs.shape[0])
    new_vector[np.arange(new_vector.shape[0]), instance ] =1
    new_instance.append(new_vector)
new_instance = np.array(new_instance)
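
As an aside, the loop can probably be avoided altogether, which removes the chance of an indentation slip. The sketch below assumes that the labels form one integer array of shape (n_samples, n_steps), which the printed shapes above suggest but which I have not verified for every dataset. The function name one_hot_sequences and the toy values are hypothetical.

```python
import numpy as np

def one_hot_sequences(labels, vocab_size):
    """One-hot encode an integer array of shape (n_samples, n_steps).

    Indexing the identity matrix with the label array returns one row
    per label, giving an array of shape (n_samples, n_steps, vocab_size).
    """
    labels = np.asarray(labels, dtype=int)
    return np.eye(vocab_size, dtype=np.float32)[labels]

# Toy example: 2 sequences, 3 time steps, vocabulary of 4 words.
toy_y = np.array([[0, 2, 1],
                  [3, 3, 0]])
encoded = one_hot_sequences(toy_y, vocab_size=4)
print(encoded.shape)
```

Note that the dense result grows as samples x steps x vocabulary size, so with a large vocabulary it can get very big, which is one more reason a smaller dataset is easier to work with.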
