My data (after shape):
X_train = numpy array, shape: (21000, 2297, 1)
X_val   = numpy array, shape: (9000, 2297, 1)
Both arrays contain time series. Due to padding, all time series have length 2297.
My model:
from tensorflow import keras
from tensorflow.keras.layers import Conv1D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam

model = keras.Sequential()
model.add(Conv1D(32, 2, activation='relu', input_shape=(2297, 1))) # input_shape = (n_columns, 1)
model.add(Dropout(0.2))
model.add(Conv1D(256, 2, activation='relu'))
model.add(Dropout(0.2))
model.add(Conv1D(32, 2, activation='relu'))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer=Adam(learning_rate=0.001), loss='mse', metrics=['mae', 'mse'])
model.summary()
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_val, y_val), verbose=1)
My question:
If I keep the input_shape argument above as it is, the model runs fine, but it takes a long time to train. My guess is that this is because not passing a batch size makes the model train on the full batch. Is that correct?
I want to pass a batch size so that the network only trains on a small portion of the data in each step. According to this post and this post, the correct order for my data entries is:
input_shape=(batch size, time steps, 1)
However, say I want a batch size of 1000. Then my first Conv1D layer looks as follows (the rest of the model stays as described above):
model = keras.Sequential()
model.add(Conv1D(32, 2, activation='relu', input_shape=(1000, 2297, 1))) # input_shape = (batch_size, n_columns, 1) ...
and it raises the following error:
ValueError: Input 0 of layer conv1d_3 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1000, 2297, 1]
Why is this, and how do I pass the batch size correctly?
In the first version of your code, you already wrote it correctly: input_shape describes a single sample and excludes the batch dimension. Keras prepends the batch dimension as None, so input_shape=(1000, 2297, 1) makes the model expect 4-D input of shape [None, 1000, 2297, 1], which is why Conv1D complains that it expected ndim=3 but found ndim=4.
How do you pass the batch size correctly? We do not pass batch_size as part of the input_shape of our model. We set batch_size in model.fit(..., batch_size=1000).
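A minimal sketch to make this concrete (reusing the first layer of your working version; everything else stays as in your model):
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu', input_shape=(2297, 1)))
print(model.input_shape)  # (None, 2297, 1) -- None is the batch dimension Keras prepends

# The batch size is chosen at training time, not in the model definition:
# model.fit(X_train, y_train, batch_size=1000, epochs=10, validation_data=(X_val, y_val))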
If your model takes a long time to train:
- Make sure you train your model on a GPU (a quick check is sketched right after this list).
- You can use fewer filters. (You use filters=256 in the second Conv1D layer; I only use filters=16 or 32.)
- You can use a larger stride. (strides=1 is the default in Conv1D; I use strides=2.)
- You can use a smaller kernel_size. (You use kernel_size=2, which is fine.)
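For the first point, you can check whether TensorFlow sees a GPU at all; an empty list means training runs on the CPU:
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # e.g. [PhysicalDevice(name='/physical_device:GPU:0', ...)]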
Full code: (training time for 10 epochs -> ~10 seconds)
import tensorflow as tf
import numpy as np
X_train = np.random.rand(21000,2297,1)
y_train = np.random.randint(0,2,21000)
X_val = np.random.rand(9000,2297,1)
y_val = np.random.randint(0,2,9000)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(filters=16,
                                 kernel_size=2,
                                 strides=2,
                                 activation='relu', input_shape=(2297, 1)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(filters=32,
                                 kernel_size=2,
                                 strides=2,
                                 activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(16, 2, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='mse', metrics=['mae', 'mse'])
model.summary()
history = model.fit(X_train, y_train,
                    batch_size=128,
                    epochs=10,
                    validation_data=(X_val, y_val),
                    verbose=1)
Output:
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_21 (Conv1D) (None, 1148, 16) 48
dropout_12 (Dropout) (None, 1148, 16) 0
conv1d_22 (Conv1D) (None, 574, 32) 1056
dropout_13 (Dropout) (None, 574, 32) 0
conv1d_23 (Conv1D) (None, 573, 16) 1040
flatten_6 (Flatten) (None, 9168) 0
dense_6 (Dense) (None, 1) 9169
=================================================================
Total params: 11,313
Trainable params: 11,313
Non-trainable params: 0
_________________________________________________________________
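The output lengths in the summary follow the standard Conv1D formula for 'valid' padding (the default), output_len = floor((input_len - kernel_size) / strides) + 1:
# output length of each Conv1D layer above
(2297 - 2) // 2 + 1  # = 1148, first Conv1D (strides=2)
(1148 - 2) // 2 + 1  # = 574,  second Conv1D (strides=2)
(574 - 2) // 1 + 1   # = 573,  third Conv1D (strides=1, the default)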
Epoch 1/10
165/165 [==============================] - 3s 13ms/step - loss: 0.2608 - mae: 0.5008 - mse: 0.2608 - val_loss: 0.2747 - val_mae: 0.4993 - val_mse: 0.2747
Epoch 2/10
165/165 [==============================] - 2s 13ms/step - loss: 0.2521 - mae: 0.4991 - mse: 0.2521 - val_loss: 0.2865 - val_mae: 0.4993 - val_mse: 0.2865
Epoch 3/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2499 - mae: 0.4969 - mse: 0.2499 - val_loss: 0.2988 - val_mae: 0.4991 - val_mse: 0.2988
Epoch 4/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2484 - mae: 0.4952 - mse: 0.2484 - val_loss: 0.2850 - val_mae: 0.4993 - val_mse: 0.2850
Epoch 5/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2481 - mae: 0.4926 - mse: 0.2481 - val_loss: 0.2650 - val_mae: 0.5001 - val_mse: 0.2650
Epoch 6/10
165/165 [==============================] - 2s 9ms/step - loss: 0.2457 - mae: 0.4899 - mse: 0.2457 - val_loss: 0.2824 - val_mae: 0.4998 - val_mse: 0.2824
Epoch 7/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2432 - mae: 0.4856 - mse: 0.2432 - val_loss: 0.2591 - val_mae: 0.5005 - val_mse: 0.2591
Epoch 8/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2426 - mae: 0.4824 - mse: 0.2426 - val_loss: 0.2649 - val_mae: 0.5009 - val_mse: 0.2649
Epoch 9/10
165/165 [==============================] - 2s 10ms/step - loss: 0.2392 - mae: 0.4781 - mse: 0.2392 - val_loss: 0.2693 - val_mae: 0.5009 - val_mse: 0.2693
Epoch 10/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2366 - mae: 0.4733 - mse: 0.2366 - val_loss: 0.2688 - val_mae: 0.5012 - val_mse: 0.2688
Note:
- I used random numbers as input.
- samples = 21000, batch_size=128 -> steps per epoch = ceil(21000/128) = ceil(164.06) = 165
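The "165/165" in the training log can be reproduced directly:
import math
print(math.ceil(21000 / 128))  # 165 batches (steps) per epoch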