How do I train a CNN for classification?

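I am currently trying to train a model for bird species recognition. The model will later be converted and hosted on an Arduino Nano 33 BLE, placed near a spot where birds come to feed.

To train my model, I used the Kaggle API to download a dataset of 250 species, split into training, validation and test sets. The images are 224x224 RGB .jpg files. To simplify labeling, I used the Keras preprocessing utilities, which label the data according to the folder it sits in; this works well.

Here is the preprocessing: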


from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All training images will be augmented
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Flow training images in batches of 128 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
    '/content/train',  # This is the source directory for training images
    target_size=(224, 224),  # All images will be resized to 224x224
    batch_size=128,
    class_mode='binary',
    color_mode='rgb',
    save_format='jpg')

validation_datagen = ImageDataGenerator(rescale=1/255)

validation_generator = validation_datagen.flow_from_directory(
    '/content/valid',
    target_size=(224, 224),
    class_mode='categorical',
    color_mode='rgb',
    save_format='jpg')

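I then built a Keras model with convolution and max-pooling layers to process the data, followed by two dense layers, the last of which uses a softmax activation. Here is my model code: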


import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(250, activation='softmax')
])

The error I get is:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-58-6a14ef1f8bcb> in <module>()
4       epochs=15,
5       verbose=1,
----> 6       validation_data=validation_generator)
6 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58     ctx.ensure_initialized()
59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
61   except core._NotOkStatusException as e:
62     if name is not None:
InvalidArgumentError:  Can not squeeze dim[1], expected a dimension of 1, got 250
[[node Squeeze (defined at <ipython-input-58-6a14ef1f8bcb>:6) ]] [Op:__inference_test_function_3788]
Function call stack:
test_function

Project repository: https://github.com/BaptisteZloch/Birds-species-spotting

I hope someone can help me solve this problem!

Regards, Baptiste ZLOCH

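I am the creator of the dataset you are using. You really don't need much image augmentation, because there are 35215 training images, 1250 test images (5 per species) and 1250 validation images (5 per species). At most I would use horizontal_flip=True; the other augmentations will contribute little and increase processing time. This is a very clean dataset in which the bird region of interest covers at least 50% of the pixels in each image.

In your train_gen you should have target_size=(150,150) and class_mode='categorical'. Your training generator currently uses class_mode='binary' while the validation generator uses 'categorical', so the two yield labels of different shapes, which is the likely source of the Squeeze error. As for save_format='jpg', that parameter is ignored unless you specify save_to_dir. It is good that you did not specify it, because otherwise training would fill that directory with a huge number of images. In your model, change input_shape to (150,150,3).

In the code below I added two callbacks, early_stop and rlronp. The first monitors the validation loss and stops training if the loss has not decreased for 4 consecutive epochs; with restore_best_weights=True it leaves the model with the weights from the epoch that had the lowest validation loss. The second also monitors the validation loss and multiplies the learning rate by 0.5 whenever the loss fails to decrease at the end of an epoch. Both are described in the Keras callbacks documentation. Working code follows: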

import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# `model` is the Sequential model from the question, built with input_shape=(150, 150, 3)
model.compile(Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
train_dir = r'c:\temp\birds\train'  # change this to point to your directory
valid_dir = r'c:\temp\birds\valid'  # change this to point to your directory
test_dir = r'c:\temp\birds\test'    # change this to point to your directory
# All three generators produce one-hot ('categorical') labels; only training data is flipped
train_gen = ImageDataGenerator(rescale=1/255, horizontal_flip=True).flow_from_directory(
    train_dir, target_size=(150, 150), batch_size=32, seed=123,
    class_mode='categorical', color_mode='rgb', shuffle=True)
valid_gen = ImageDataGenerator(rescale=1/255).flow_from_directory(
    valid_dir, target_size=(150, 150), batch_size=32, seed=123,
    class_mode='categorical', color_mode='rgb', shuffle=False)
test_gen = ImageDataGenerator(rescale=1/255).flow_from_directory(
    test_dir, target_size=(150, 150), batch_size=32, seed=123,
    class_mode='categorical', color_mode='rgb', shuffle=False)
# Stop after 4 epochs without improvement in val_loss and restore the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=4, verbose=1, restore_best_weights=True)
# Halve the learning rate after 1 epoch without improvement in val_loss
rlronp = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=1, verbose=1)
history = model.fit(x=train_gen, epochs=30, verbose=1,
                    callbacks=[early_stop, rlronp], validation_data=valid_gen)
# evaluate() returns [loss, accuracy]; the batch size comes from the generator
performance = model.evaluate(test_gen, verbose=1)[1] * 100
print('Model accuracy on test set is ', performance, ' %')
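
To make the behaviour of the two callbacks more concrete, here is a minimal plain-Python sketch (no TensorFlow needed) that replays a made-up list of validation losses through simplified versions of the two rules. The function name simulate_callbacks and the loss values are hypothetical and only for illustration. This is not how Keras implements the callbacks: it assumes any decrease counts as an improvement and ignores options such as min_delta, cooldown and min_lr.

```python
def simulate_callbacks(val_losses, lr=0.001, stop_patience=4,
                       lr_patience=1, factor=0.5):
    """Replay per-epoch validation losses through simplified
    early-stopping and reduce-on-plateau rules.

    Simplifications (assumptions): any decrease is an improvement
    (no min_delta), and there is no cooldown or minimum learning rate.
    Returns (best_epoch, last_epoch_run, learning_rate_per_epoch).
    """
    best_loss = float("inf")
    best_epoch = None
    epoch = 0
    stop_wait = 0  # epochs since the best loss (early stopping)
    lr_wait = 0    # epochs since the last improvement or LR cut
    lrs = []       # learning rate in effect during each epoch

    for epoch, loss in enumerate(val_losses, start=1):
        lrs.append(lr)
        if loss < best_loss:
            # New best validation loss: reset both counters.
            best_loss, best_epoch = loss, epoch
            stop_wait = 0
            lr_wait = 0
        else:
            stop_wait += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor  # cut the learning rate
                lr_wait = 0
            if stop_wait >= stop_patience:
                break  # early stop; best weights are from best_epoch
    return best_epoch, epoch, lrs


# Made-up validation losses, purely for illustration.
losses = [2.0, 1.5, 1.6, 1.4, 1.45, 1.43, 1.42, 1.41, 1.3]
best_epoch, last_epoch, lrs = simulate_callbacks(losses)
print("best epoch:", best_epoch)       # best epoch: 4
print("stopped after epoch:", last_epoch)  # stopped after epoch: 8
print("learning rates:", lrs)
```

In this made-up run, training stops after epoch 8 and never reaches the lower loss at epoch 9. That is the trade-off of a small patience value: it saves time but can stop before a late improvement.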

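With 250 classes, your model will not reach a very high accuracy. The more classes there are, the harder the problem becomes. I would build a more complex model, with more convolutional layers and perhaps an additional dense layer. If you add an extra dense layer, also include a dropout layer to prevent overfitting.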
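
When adding convolutional layers, it can help to check how quickly the feature maps shrink. The sketch below is plain Python arithmetic, not Keras code. It assumes 3x3 convolutions with 'valid' padding and 2x2 max pooling, as in the question's model, and the helper name spatial_sizes is hypothetical.

```python
def spatial_sizes(input_size, num_stages, kernel=3, pool=2):
    """Return the feature-map side length after each conv and pool layer.

    Assumes stride-1 convolutions with 'valid' padding and a pooling
    stride equal to the pool size (the layer settings in the question).
    """
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size = size - kernel + 1  # 3x3 'valid' convolution
        sizes.append(size)
        size = size // pool       # 2x2 max pooling rounds down
        sizes.append(size)
    return sizes


# Five conv + pool stages, as in the question's model.
print(224, "->", spatial_sizes(224, num_stages=5))
# 224 -> [222, 111, 109, 54, 52, 26, 24, 12, 10, 5]
print(150, "->", spatial_sizes(150, num_stages=5))
# 150 -> [148, 74, 72, 36, 34, 17, 15, 7, 5, 2]
```

With a 150x150 input, the five existing stages already reduce the feature map to 2x2, so a sixth 3x3 'valid' convolution would not fit. Extra convolutional layers would therefore need padding='same' on Conv2D, or fewer pooling layers between them.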
