Is it logical to loop on model.fit in Keras?

Is it logical to do the following in Keras so as not to run out of memory?

    for path in ['xaa', 'xab', 'xac', 'xad']:
        x_train, y_train = prepare_data(path)
        model.fit(x_train, y_train, batch_size=50, epochs=20, shuffle=True)

    model.save('model')

Yes, but prefer model.train_on_batch if each iteration produces a single batch. This avoids some of the overhead that comes with fit.
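As a rough illustration of the train_on_batch approach, here is a minimal sketch. The helper name train_on_chunks is hypothetical, and it assumes prepare_data(path) returns two sliceable sequences of equal length (for example NumPy arrays), as in the question. Only model.train_on_batch(x, y) is a Keras call here.

```python
def train_on_chunks(model, paths, prepare_data, batch_size=50, epochs=20):
    """Train by feeding one batch at a time to model.train_on_batch.

    Assumption: prepare_data(path) returns (x, y) with len(x) == len(y).
    Returns the number of batch updates performed.
    """
    n_updates = 0
    for epoch in range(epochs):
        for path in paths:
            # Only one file is held in memory at a time.
            x_train, y_train = prepare_data(path)
            for start in range(0, len(x_train), batch_size):
                stop = start + batch_size
                model.train_on_batch(x_train[start:stop], y_train[start:stop])
                n_updates += 1
    return n_updates
```

Unlike fit(..., shuffle=True), this sketch does not shuffle samples within a file. If that matters, the data would need to be permuted before slicing.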

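You can also try creating a generator and using model.fit_generator() (in TensorFlow 2.x, model.fit() accepts generators directly and fit_generator() is deprecated):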

    def dataGenerator(paths, batch_size):
        while True:  # generators for Keras must be infinite
            for path in paths:
                x_train, y_train = prepare_data(path)
                totalSamps = x_train.shape[0]
                # number of batches in this file, counting a final partial batch
                batches = totalSamps // batch_size
                if totalSamps % batch_size > 0:
                    batches += 1
                for batch in range(batches):
                    section = slice(batch * batch_size, (batch + 1) * batch_size)
                    yield (x_train[section], y_train[section])

Create and use it:

    gen = dataGenerator(['xaa', 'xab', 'xac', 'xad'], 50)
    model.fit_generator(gen,
                        steps_per_epoch=expectedTotalNumberOfYieldsForOneEpoch,
                        epochs=epochs)
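The steps_per_epoch value has to match the number of batches the generator yields in one pass over all files. A minimal sketch of that arithmetic follows. The helper name and the sample counts are hypothetical; in practice each count would be x_train.shape[0] for the corresponding file.

```python
def steps_per_epoch(sample_counts, batch_size):
    """Total batches yielded in one pass over all files.

    Each file contributes ceil(n / batch_size) batches, because a
    final partial batch is still yielded.
    """
    return sum((n + batch_size - 1) // batch_size for n in sample_counts)


# Hypothetical sample counts for four files, with batch_size=50:
# 3 + 2 + 1 + 4 batches.
total_steps = steps_per_epoch([120, 100, 49, 200], 50)
```

Note that the ceiling is taken per file rather than over the total sample count, since batches never span two files in the generator.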

I suggest taking a look at this thread on GitHub.

You can indeed use model.fit(), but doing it this way makes training more stable:

    for epoch in range(20):
        for path in ['xaa', 'xab', 'xac', 'xad']:
            x_train, y_train = prepare_data(path)
            model.fit(x_train, y_train, batch_size=50, epochs=epoch+1, initial_epoch=epoch, shuffle=True)

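This way, you iterate over all of your data once per epoch, instead of iterating for 20 epochs over one part of the data before switching to the next.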
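To make the difference in ordering concrete, the sketch below lists which (epoch, file) pairs each loop structure visits and in what order. The function names are hypothetical and no training happens here; it only shows the schedule.

```python
def file_major(paths, epochs):
    """Order used by the loop in the question: all epochs on one file, then the next."""
    return [(epoch, path) for path in paths for epoch in range(epochs)]


def epoch_major(paths, epochs):
    """Order used by the loop above: every file once per epoch."""
    return [(epoch, path) for epoch in range(epochs) for path in paths]
```

With the file-major order, the model finishes all of its updates on 'xaa' before it ever sees 'xab', so later files can overwrite what was learned from earlier ones. The epoch-major order interleaves the files.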

As mentioned in the thread, another solution is to write your own data generator and use it with model.fit_generator().
