Why can't Keras see my GPU when TensorFlow can?

Following an SO answer, I ran:

# confirm TensorFlow sees the GPU
from tensorflow.python.client import device_lib
assert 'GPU' in str(device_lib.list_local_devices())
# confirm Keras sees the GPU
from keras import backend
assert len(backend.tensorflow_backend._get_available_gpus()) > 0
# confirm PyTorch sees the GPU
from torch import cuda
assert cuda.is_available()
assert cuda.device_count() > 0
print(cuda.get_device_name(cuda.current_device()))

The first test passes, while the others fail.

Running nvcc --version gives:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

nvidia-smi also works.

list_local_devices() gives:

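[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 459307207819325532, name: "/device:XLA_GPU:0" device_type: "XLA_GPU" memory_limit: 17179869184 locality { } incarnation: 9054555249843627113 physical_device_desc: "device: XLA_GPU device", name: "/device:XLA_CPU:0" device_type: "XLA_CPU" memory_limit: 17179869184 locality { } incarnation: 5902450771458744885 physical_device_desc: "device: XLA_CPU device"]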
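
One thing worth noting about this list: it contains an XLA_GPU entry but no plain GPU entry, so the substring test 'GPU' in str(...) passes on "XLA_GPU" alone. A stricter check compares device_type exactly. The sketch below uses stand-in objects that mirror the three devices listed above; the helper name real_gpus is hypothetical and not part of TensorFlow.

```python
from types import SimpleNamespace


def real_gpus(devices):
    """Return the names of devices whose device_type is exactly 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


# Stand-ins mirroring the device list reported above (not real TensorFlow objects).
reported = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:XLA_GPU:0", device_type="XLA_GPU"),
    SimpleNamespace(name="/device:XLA_CPU:0", device_type="XLA_CPU"),
]

print("GPU" in str(reported))  # True: the substring matches "XLA_GPU"
print(real_gpus(reported))     # []: no device of type "GPU" is present
```

With TensorFlow installed, the same helper should accept the result of device_lib.list_local_devices() directly, since those entries expose name and device_type attributes.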

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) returns:

Device mapping:
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device

Why can't Keras and PyTorch run on my GPU? (RTX 2070)

I had a hard time tracking this problem down. In the end, running the CUDA samples gave me a useful clue:

CUDA error at ../../common/inc/helper_cuda.h:1162 code=30(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)"

With sudo:

MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2070" with compute capability 7.5

So the problem was that my CUDA libraries were not readable by all users.

My error was fixed with:

sudo chmod -R a+r /usr/local/cuda*
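
To see which files are affected before (or after) changing permissions, a small script can list everything under the CUDA directories that lacks the world-read bit. This is only a sketch: the function name unreadable_by_others is hypothetical, and it assumes the install lives under /usr/local/cuda* as in the command above.

```python
import glob
import os
import stat


def unreadable_by_others(root):
    """Return root and any paths beneath it that lack the world-read bit."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidates.append(os.path.join(dirpath, name))

    missing = []
    for path in candidates:
        try:
            mode = os.lstat(path).st_mode
        except OSError:
            continue  # path vanished or cannot be inspected
        if stat.S_ISLNK(mode):
            continue  # permissions on the symlink itself are not meaningful
        if not mode & stat.S_IROTH:
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Assumed install location, taken from the chmod command above.
    for cuda_root in glob.glob("/usr/local/cuda*"):
        for path in unreadable_by_others(cuda_root):
            print(path)
```

Run as a normal user, the walk cannot descend into directories that user may not read, but those directories are themselves reported, which is usually enough to confirm the diagnosis.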

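I ran into this problem recently. It turned out that the prerequisite packages installed with pip (such as keras) were not built with the XLA-related flags. When I installed the required packages through a full miniconda or anaconda setup instead, I was able to run my code. In my case, I was running Facebook's AI code.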

An early indicator that something is wrong is to run:

nvidia-smi

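and see that your deep net is using kilobytes of GPU memory rather than gigabytes. Then, even without a warning (which is sometimes hard to find in the logs), you know the problem lies in how the necessary software was compiled. You know this because the GPU does not match on device type, so TensorFlow falls back to the CPU and the code runs there instead.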
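
The same check can be scripted with nvidia-smi's query mode instead of reading the table by eye. The sketch below assumes nvidia-smi is on the PATH and that every GPU reports numeric values; the function names are hypothetical.

```python
import shutil
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' lines (MiB, no header, no units) into int pairs."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory_usage():
    """Return [(used_mib, total_mib), ...], or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return None
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    print(gpu_memory_usage())
```

Calling gpu_memory_usage() while training is running and seeing only a handful of MiB in use points to the same conclusion: the work is not landing on the GPU.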

In my case, I used miniconda to install tensorflow-gpu, ipython, imutils, imgaug, and a few other packages. If you find that a required package is missing from conda, use:

conda install -c conda-forge <package-name>

to pick up the missing items, such as imutils and imgaug.
