Batch size in the input_shape argument of a Keras Conv1D layer

My data (after reshaping):

  • X_train: NumPy array, shape (21000, 2297, 1)
  • X_val: NumPy array, shape (9000, 2297, 1)

Both arrays contain time series. Due to padding, all time series have length 2297.


My model:

from tensorflow import keras
from tensorflow.keras.layers import Conv1D, Dropout, Flatten, Dense
from tensorflow.keras.optimizers import Adam

model = keras.Sequential()
model.add(Conv1D(32, 2, activation='relu', input_shape=(2297, 1))) # input_shape = (n_columns, 1)
model.add(Dropout(0.2))
model.add(Conv1D(256, 2, activation='relu'))
model.add(Dropout(0.2))
model.add(Conv1D(32, 2, activation='relu'))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer=Adam(learning_rate=0.001), loss='mse', metrics=['mae', 'mse'])
model.summary()
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_val, y_val), verbose=1)

My problem:

If I keep the input_shape statement above, the model runs fine, but it takes a very long time to train. My guess is that this is because, without a batch size, the model trains on the full batch. Is that right?

I would like to pass a batch size so that the network trains on only a small chunk of the data at each step. According to this post and this post, the correct ordering of my data's dimensions is:

input_shape=(batch size, time steps, 1)

But suppose I want a batch size of 1000. Then my first Conv1D layer looks as follows (the rest of the model stays as above):

model = keras.Sequential()
model.add(Conv1D(32, 2, activation='relu', input_shape=(1000, 2297, 1))) # input_shape = (batch_size, n_columns, 1) ...

and it raises the following error:

ValueError: Input 0 of layer conv1d_3 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1000, 2297, 1]

Why is that, and how do I pass the batch size correctly?

Your first version of the code is correct: input_shape must not include the batch dimension. Keras prepends the batch dimension automatically at runtime (that is the None in the error message), so adding it yourself produces a 4-D input where the layer expects 3-D.

How do you pass the batch size correctly? You don't pass batch_size as part of input_shape; you set it in model.fit(..., batch_size=1000). (And if you don't set it at all, fit() defaults to batch_size=32; it does not use the full dataset as one batch.)
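
For example, keeping the model definition from the question unchanged, a minimal sketch of the corrected call looks like this:

# input_shape stays (2297, 1); Keras adds the batch dimension itself
history = model.fit(X_train, y_train,
                    batch_size=1000,   # mini-batches of 1000 samples per step
                    epochs=10,
                    validation_data=(X_val, y_val),
                    verbose=1)

(For the rare cases where the batch size must be fixed inside the model itself, e.g. stateful RNNs, tf.keras layers also accept batch_input_shape=(1000, 2297, 1), but for ordinary training you don't need it.)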

If your model takes a long time to train:

  1. Make sure the model is training on a GPU (see the quick check after this list).
  2. Use fewer filters. (You use filters=256 in the second Conv1D layer; I only use 16 or 32.)
  3. Use a larger stride. (strides=1 is the Conv1D default; I use strides=2.)
  4. You could use a smaller kernel_size, but your kernel_size=2 is already fine.
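
For point 1, you can check whether TensorFlow sees a GPU at all; an empty list means training falls back to the CPU:

import tensorflow as tf

# Physical GPUs visible to TensorFlow; an empty list means CPU-only training
print(tf.config.list_physical_devices('GPU'))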

Full code (10 epochs train in about 10 seconds):

import tensorflow as tf
import numpy as np

# Dummy data with the same shapes as in the question
X_train = np.random.rand(21000, 2297, 1)
y_train = np.random.randint(0, 2, 21000)
X_val = np.random.rand(9000, 2297, 1)
y_val = np.random.randint(0, 2, 9000)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(filters=16,
                                 kernel_size=2,
                                 strides=2,
                                 activation='relu',
                                 input_shape=(2297, 1)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(filters=32,
                                 kernel_size=2,
                                 strides=2,
                                 activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv1D(16, 2, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='mse', metrics=['mae', 'mse'])
model.summary()
history = model.fit(X_train, y_train,
                    batch_size=128,
                    epochs=10,
                    validation_data=(X_val, y_val),
                    verbose=1)

Output:

Model: "sequential_8"
_________________________________________________________________
Layer (type)                Output Shape              Param #   
=================================================================
conv1d_21 (Conv1D)          (None, 1148, 16)          48        

dropout_12 (Dropout)        (None, 1148, 16)          0         

conv1d_22 (Conv1D)          (None, 574, 32)           1056      

dropout_13 (Dropout)        (None, 574, 32)           0         

conv1d_23 (Conv1D)          (None, 573, 16)           1040      

flatten_6 (Flatten)         (None, 9168)              0         

dense_6 (Dense)             (None, 1)                 9169      

=================================================================
Total params: 11,313
Trainable params: 11,313
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
165/165 [==============================] - 3s 13ms/step - loss: 0.2608 - mae: 0.5008 - mse: 0.2608 - val_loss: 0.2747 - val_mae: 0.4993 - val_mse: 0.2747
Epoch 2/10
165/165 [==============================] - 2s 13ms/step - loss: 0.2521 - mae: 0.4991 - mse: 0.2521 - val_loss: 0.2865 - val_mae: 0.4993 - val_mse: 0.2865
Epoch 3/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2499 - mae: 0.4969 - mse: 0.2499 - val_loss: 0.2988 - val_mae: 0.4991 - val_mse: 0.2988
Epoch 4/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2484 - mae: 0.4952 - mse: 0.2484 - val_loss: 0.2850 - val_mae: 0.4993 - val_mse: 0.2850
Epoch 5/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2481 - mae: 0.4926 - mse: 0.2481 - val_loss: 0.2650 - val_mae: 0.5001 - val_mse: 0.2650
Epoch 6/10
165/165 [==============================] - 2s 9ms/step - loss: 0.2457 - mae: 0.4899 - mse: 0.2457 - val_loss: 0.2824 - val_mae: 0.4998 - val_mse: 0.2824
Epoch 7/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2432 - mae: 0.4856 - mse: 0.2432 - val_loss: 0.2591 - val_mae: 0.5005 - val_mse: 0.2591
Epoch 8/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2426 - mae: 0.4824 - mse: 0.2426 - val_loss: 0.2649 - val_mae: 0.5009 - val_mse: 0.2649
Epoch 9/10
165/165 [==============================] - 2s 10ms/step - loss: 0.2392 - mae: 0.4781 - mse: 0.2392 - val_loss: 0.2693 - val_mae: 0.5009 - val_mse: 0.2693
Epoch 10/10
165/165 [==============================] - 1s 9ms/step - loss: 0.2366 - mae: 0.4733 - mse: 0.2366 - val_loss: 0.2688 - val_mae: 0.5012 - val_mse: 0.2688
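
The output shapes in the summary can be verified with the standard length formula for a valid-padding Conv1D, floor((n - kernel_size) / strides) + 1:

# Output length of a valid-padding Conv1D: (n - kernel_size) // strides + 1
print((2297 - 2) // 2 + 1)  # 1148 -> first Conv1D
print((1148 - 2) // 2 + 1)  # 574  -> second Conv1D
print((574 - 2) // 1 + 1)   # 573  -> third Conv1D (default strides=1)
print(573 * 16)             # 9168 -> Flatten: 573 steps x 16 filters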

Notes:

  1. I used random numbers as input.
  2. Samples = 21000 and batch_size = 128, so steps per epoch = ceil(21000 / 128) = 165 (reproduced below).
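
The step count in the log follows from rounding the division up, since the final, smaller batch still counts as one step:

import math

print(math.ceil(21000 / 128))  # 165: 164 full batches + 1 partial batch of 8 samples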
