CNN accuracy is low only on local GPU


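For some reason, all of my convolutional neural networks have very poor accuracy, regardless of which model I compile. This happens on my local machine in a Jupyter notebook, using an RTX 3060 Ti GPU with CUDA 11.1.

When I use Google Colab, all of my code works with high accuracy. Note that this only affects convolutional neural networks; networks with only densely connected layers work fine.

Some details: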

Tensor Flow Version: 2.1.0
Keras Version: 2.2.4-tf
Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]
Pandas 1.2.0
Scikit-Learn 0.24.0
GPU is available

Here is some sample code (binary classification, 50/50 split):

from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-6),  # decrease learning rate
              metrics=['accuracy'])
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),
                                                    batch_size=20,
                                                    class_mode='binary')
validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        target_size=(150, 150),
                                                        batch_size=20,
                                                        class_mode='binary')
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)
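For reference, the step counts hard-coded above fix how many images each epoch draws from the generators. The arithmetic below is a small sketch in plain Python; the helper `steps_for` and the dataset sizes of 2000 and 1000 images are assumptions for illustration, since the question does not say how many images the directories contain.

```python
import math

def steps_for(num_images, batch_size):
    """Batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)

batch_size = 20          # from flow_from_directory in the question
steps_per_epoch = 100    # from fit_generator in the question
validation_steps = 50    # from fit_generator in the question

# Images drawn per epoch with the hard-coded step counts.
train_images_per_epoch = steps_per_epoch * batch_size   # 2000
val_images_per_epoch = validation_steps * batch_size    # 1000

# Hypothetical dataset sizes, NOT given in the question.
assumed_train_images = 2000
assumed_val_images = 1000

print(steps_for(assumed_train_images, batch_size))  # 100
print(steps_for(assumed_val_images, batch_size))    # 50
```

If the real directories hold a different number of images, the hard-coded step counts no longer correspond to exactly one pass over the data per epoch.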

Result:

WARNING:tensorflow:From <ipython-input-8-f61a1535c537>:6: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
WARNING:tensorflow:sample_weight modes were coerced from
...
to  
['...']
WARNING:tensorflow:sample_weight modes were coerced from
...
to  
['...']
Train for 100 steps, validate for 50 steps
Epoch 1/30
100/100 [==============================] - 1158s 12s/step - loss: 0.7020 - accuracy: 0.4945 - val_loss: 0.8541 - val_accuracy: 0.4980
Epoch 2/30
100/100 [==============================] - 5s 47ms/step - loss: 0.6987 - accuracy: 0.5105 - val_loss: 0.6930 - val_accuracy: 0.5000
Epoch 3/30
100/100 [==============================] - 5s 47ms/step - loss: 0.7000 - accuracy: 0.4985 - val_loss: 0.8449 - val_accuracy: 0.5000
Epoch 4/30
100/100 [==============================] - 5s 47ms/step - loss: 0.6967 - accuracy: 0.4975 - val_loss: 0.7162 - val_accuracy: 0.4800
Epoch 5/30
100/100 [==============================] - 5s 47ms/step - loss: 0.6931 - accuracy: 0.4945 - val_loss: 0.8477 - val_accuracy: 0.4990
Epoch 6/30
100/100 [==============================] - 5s 47ms/step - loss: 0.6931 - accuracy: 0.4895 - val_loss: 0.7846 - val_accuracy: 0.5000
Epoch 7/30
100/100 [==============================] - 5s 47ms/step - loss: 0.6933 - accuracy: 0.4860 - val_loss: 0.7468 - val_accuracy: 0.5000
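
A note on reading this log: a training loss that settles at 0.6931 equals ln 2, which is the value binary cross-entropy takes when the model outputs 0.5 for every sample. Together with an accuracy near 0.5 on a 50/50 split, this suggests the network is predicting at chance level. The sketch below recomputes that value in plain Python; the helper `binary_crossentropy` is written here for illustration and is not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred):
    """Mean binary cross-entropy for lists of labels (0 or 1) and predicted probabilities."""
    losses = [
        -(t * math.log(p) + (1 - t) * math.log(1 - p))
        for t, p in zip(y_true, y_pred)
    ]
    return sum(losses) / len(losses)

# A balanced batch where the model predicts 0.5 for every image.
labels = [0, 1, 0, 1]
chance_predictions = [0.5] * len(labels)

chance_loss = binary_crossentropy(labels, chance_predictions)
print(round(chance_loss, 4))  # 0.6931, the same as math.log(2) rounded
```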

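I don't use Colab, so I'm not sure why it trains well there; perhaps it uses a different version of TensorFlow.

In model.fit you can set values for steps_per_epoch and validation_steps. I find it best to leave these as None; Model.fit will then determine the correct values automatically. Also, model.fit_generator is deprecated, so use model.fit instead.

Your learning rate is very low; I would try something like 0.001. I ran your code on an image dataset with two classes and the model did train, but it converged slowly with lr=1e-6. With a learning rate of 0.001 it converged much faster, reaching in 3 epochs the same accuracy (about 81%) that took 30 epochs at the small learning rate.

I'm using TensorFlow 2.1 and had no problems getting your code to work with the changes noted above. I think the problem may be that you are using CUDA 11.1; I believe TensorFlow 2.1 is meant to be used with CUDA 10.1. You should have CUDA Toolkit 10.1.243 and cuDNN 7.6.5 installed.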
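
The version mismatch described above can be checked with a small lookup. This is only a sketch: the table `TESTED_CONFIGS` and the helper `check_cuda` are made up for illustration, and the two entries were copied by hand from the tested build configurations in the TensorFlow install documentation, so verify them there before relying on them.

```python
# Tested GPU build configurations, copied by hand from the TensorFlow
# install documentation. Only two releases are listed in this sketch.
TESTED_CONFIGS = {
    "2.1": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}

def major_minor(version):
    """Reduce a version string such as '10.1.243' to '10.1'."""
    return ".".join(version.split(".")[:2])

def check_cuda(tf_version, installed_cuda):
    """Return a message saying whether installed_cuda matches the tested version."""
    tf_release = major_minor(tf_version)
    config = TESTED_CONFIGS.get(tf_release)
    if config is None:
        return f"No entry for TensorFlow {tf_release} in this table."
    if major_minor(installed_cuda) == config["cuda"]:
        return f"CUDA {installed_cuda} matches the tested version for TensorFlow {tf_release}."
    return (f"TensorFlow {tf_release} was tested with CUDA {config['cuda']} "
            f"and cuDNN {config['cudnn']}, but CUDA {installed_cuda} is installed.")

# The combination reported in the question.
print(check_cuda("2.1.0", "11.1"))
```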

For some reason, running this code fixes everything. Can anyone explain why? Thanks.

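import tensorflow as tf

# Let TensorFlow allocate GPU memory on demand instead of reserving it all at startup.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before the GPUs have been initialized.
        print(e)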
