What is the best way to train an LSTM network on batches of sequences with different lengths?



So I have a sequence-to-sequence problem where the inputs are many multivariate sequences of varying lengths, and the outputs are sequences of binary vectors with the same length as their corresponding inputs. I group sequences of the same length into a separate folder and call the fit function like this:

import os
import numpy as np

def get_data(i):
    # Load the bucket of same-length sequences stored under f_{i}
    train_x = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_x.npy"), allow_pickle=True)
    train_y = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_y.npy"), allow_pickle=True)
    print(f"batch no {i} Train X size= ", train_x.shape)
    print(f"batch no {i} Train Y size= ", train_y.shape)
    batch_size = train_x.shape[0]
    return train_x, train_y, batch_size

for e in range(epochs):
    print('Epoch', e + 1)
    for i in range(3, 19):
        train_x_batch, train_y_batch, batch_size = get_data(i)
        history = model.fit(train_x_batch, train_y_batch,
                            batch_size=batch_size,
                            validation_split=0.15,
                            callbacks=[tensorboard_cb])

So the question is: is there a better way to do this? I have heard that I could use a generator for this, but unfortunately I have not managed to implement one.

You are trying to train on the entire data (the .npy file) at once, rather than training the model in batches.

We can write a generator and train the model in batches instead.

To extract batches from the existing NumPy files without loading them fully, memory-map them:

train_x = np.load(os.path.join(cwd, "lab_values","batches",f"f_{i}","train_x.npy"), mmap_mode='r', allow_pickle=True)

x_batch = train_x[start:end].copy()
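
As a quick illustration of why mmap_mode='r' helps (using a hypothetical demo file big.npy created just for this example), slicing a memory-mapped array materializes only the requested rows, not the whole file:

import numpy as np

# Hypothetical demo file; any .npy array on disk behaves the same way
np.save("big.npy", np.zeros((10_000, 100), dtype=np.float32))

mapped = np.load("big.npy", mmap_mode='r')  # opens the file, reads no data yet
print(type(mapped))                         # <class 'numpy.memmap'>

batch = mapped[128:160].copy()              # only these 32 rows hit memory
print(type(batch), batch.shape)             # <class 'numpy.ndarray'> (32, 100)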

The complete generator and the training code are shown below:

import os
import numpy as np

def get_data(i, batch_size=32):
    # Memory-map the bucket files so only the sliced rows are read from disk
    train_x = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_x.npy"),
                      mmap_mode='r', allow_pickle=True)
    train_y = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_y.npy"),
                      mmap_mode='r', allow_pickle=True)
    print(f"batch no {i} Train X size= ", train_x.shape)
    print(f"batch no {i} Train Y size= ", train_y.shape)
    number_of_rows = train_x.shape[0]
    while True:
        # Draw a random contiguous batch; .copy() materializes it in memory
        start = np.random.choice(number_of_rows - batch_size)
        end = start + batch_size
        x_batch = train_x[start:end].copy()
        y_batch = train_y[start:end].copy()
        yield x_batch, y_batch

for e in range(epochs):
    print('Epoch', e + 1)
    for i in range(3, 19):
        # Note: batch_size and validation_split are not accepted with
        # generator input; the batch size is set inside the generator, and
        # validation needs validation_data=<generator> plus validation_steps.
        history = model.fit(get_data(i),
                            steps_per_epoch=500,
                            epochs=20,
                            callbacks=[tensorboard_cb])
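
If you want each pass to cover every batch in a bucket exactly once instead of drawing random slices, a tf.keras.utils.Sequence is the idiomatic Keras alternative. Below is a minimal sketch under the same assumptions as above (cwd and the f_{i} bucket layout from the question; the class name BucketSequence is my own):

import os
import numpy as np
import tensorflow as tf

class BucketSequence(tf.keras.utils.Sequence):
    # Serves fixed-size batches from one bucket of same-length sequences
    def __init__(self, i, batch_size=32):
        self.x = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_x.npy"),
                         mmap_mode='r', allow_pickle=True)
        self.y = np.load(os.path.join(cwd, "lab_values", "batches", f"f_{i}", "train_y.npy"),
                         mmap_mode='r', allow_pickle=True)
        self.batch_size = batch_size

    def __len__(self):
        # Number of full batches in this bucket; defines one epoch
        return len(self.x) // self.batch_size

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = start + self.batch_size
        return self.x[start:end].copy(), self.y[start:end].copy()

for i in range(3, 19):
    model.fit(BucketSequence(i), epochs=20, callbacks=[tensorboard_cb])

Because __len__ tells Keras how many batches make up one epoch, steps_per_epoch is no longer needed, and fit can shuffle the batch order between epochs.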

For more information, refer to this SO question and this SO question.
