Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed



I am working on a face recognition project in Google Colab. When I try to run the following code

H = model.fit(
    aug.flow(trainX, trainY, batch_size=BS),
    steps_per_epoch=len(trainX) // BS,
    validation_data=(testX, testY),
    validation_steps=len(testX) // BS,
    epochs=EPOCHS)

it gives me this error:

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node model/Conv1/Conv2D
(defined at /usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py:238)
]] [Op:__inference_train_function_7525]
Errors may have originated from an input operation.
Input Source operations connected to node model/Conv1/Conv2D:
In[0] IteratorGetNext (defined at /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:866)  
In[1] model/Conv1/Conv2D/ReadVariableOp:

There are more errors and the traceback goes on… I did try restarting the runtime, and most of the existing solutions to this problem are for local machines. If anyone knows a fix, please help me.

TensorFlow version: 2.7.0, CUDA version: 11.2

You can turn on memory growth by calling tf.config.experimental.set_memory_growth, which makes the runtime allocate only as much GPU memory as it actually needs: it starts by allocating very little, and as the model trains and requires more GPU memory, the allocation is extended. To turn on memory growth for the available GPUs, use the following code before allocating any tensors or executing any ops.

import tensorflow as tf

def solve_cudnn_error():
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if gpus:
        try:
            # Currently, memory growth needs to be the same across GPUs
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)
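
For example, you would call this helper once at the very top of your notebook, before building or compiling the model (a minimal usage sketch; solve_cudnn_error is the function above, and model, aug, trainX, etc. are assumed to be the same objects as in your question):

import tensorflow as tf

# Must run before any tensors are allocated or any ops are executed,
# i.e. before the model is constructed.
solve_cudnn_error()

# ... then build and compile the model as usual, and train:
# H = model.fit(
#     aug.flow(trainX, trainY, batch_size=BS),
#     steps_per_epoch=len(trainX) // BS,
#     validation_data=(testX, testY),
#     validation_steps=len(testX) // BS,
#     epochs=EPOCHS)

If memory growth is set after the GPU has already been initialized (for example, after a model has been created in an earlier cell), a RuntimeError is raised, so restart the Colab runtime and run this cell first.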