TensorFlow: Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory



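I have been trying to run TensorFlow on the GPU for several days now, without success.

I know there are several similar questions, but I have tried everything I found in them and nothing worked, which is why I am writing this one:

How to install libcusolver.so.11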

https://stackoverflow.com/a/67642774/15098668

I have installed driver 460.106.00 and CUDA 11.2 for an NVIDIA GeForce RTX 3090:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    On   | 00000000:08:00.0  On |                  N/A |
| 33%   26C    P8    22W / 350W |    282MiB / 24260MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
         
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1264      G   /usr/lib/xorg/Xorg                 59MiB |
|    0   N/A  N/A      3349      G   /usr/lib/xorg/Xorg                124MiB |
|    0   N/A  N/A      3508      G   /usr/bin/gnome-shell               77MiB |
|    0   N/A  N/A      6384      G   /usr/lib/firefox/firefox            4MiB |
+-----------------------------------------------------------------------------+

cuDNN:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 1

GCC:

gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

I also added LD_LIBRARY_PATH to my ~/.bashrc:

# Nvidia cuda toolkit
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

I have tried several tensorflow and tensorflow-gpu versions, from 2.4 to 2.7, but every one of them fails:

2022-01-24 21:28:43.206834: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

2022-01-24 21:28:44.087779: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087827: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087858: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087891: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087921: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087947: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087975: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory

Thanks in advance. I don't know what else to try...
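The warnings above are dlopen failures, so the loader's view can be reproduced outside TensorFlow by trying to open the same library names directly. The following is a minimal sketch, not part of the original question: `can_load` and `report` are hypothetical helper names, and the result depends on the LD_LIBRARY_PATH of the shell that started Python.

```python
import ctypes

# Library names copied from the dso_loader warnings above.
CUDA_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
]


def can_load(name):
    """Return True if the dynamic loader can open `name`, False otherwise."""
    try:
        ctypes.CDLL(name)  # uses dlopen on Linux, like TensorFlow's dso_loader
    except OSError:
        return False
    return True


def report(names=CUDA_LIBS):
    """Map each library name to whether it could be opened."""
    return {name: can_load(name) for name in names}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If a library shows as not found here as well, the problem is likely in the library search path or the installed CUDA version rather than in TensorFlow itself.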

Make sure you follow the TensorFlow software compatibility table: https://www.tensorflow.org/install/source#gpu

More details here: https://stackoverflow.com/a/50622526

I ran into this problem when using:

  • python==3.10
  • tensorflow==2.8.0
  • cuda==11.0
  • cudnn==8.0

I solved it by downgrading Python to 3.6 and TensorFlow to 2.4.0, which satisfies the TensorFlow compatibility requirements.
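As an illustration of the check described above, here is a small sketch that compares an installed combination against the tested configurations. `TESTED_CONFIGS` and `find_mismatches` are hypothetical names, and the two table rows were copied by hand from the linked compatibility page, so treat them as an assumption and verify them against the page before relying on them.

```python
# Subset of the "tested build configurations" table (Linux, GPU).
# Copied by hand; an assumption to be verified against the linked page.
TESTED_CONFIGS = {
    "2.4.0": {"python": ("3.6", "3.7", "3.8"), "cudnn": "8.0", "cuda": "11.0"},
    "2.8.0": {"python": ("3.7", "3.8", "3.9", "3.10"), "cudnn": "8.1", "cuda": "11.2"},
}


def find_mismatches(tf_version, python, cuda, cudnn):
    """Return a list of mismatch descriptions; empty if the combination is a tested one."""
    expected = TESTED_CONFIGS[tf_version]
    problems = []
    if python not in expected["python"]:
        problems.append(f"python {python} not in {expected['python']}")
    if cuda != expected["cuda"]:
        problems.append(f"cuda {cuda} != {expected['cuda']}")
    if cudnn != expected["cudnn"]:
        problems.append(f"cudnn {cudnn} != {expected['cudnn']}")
    return problems


if __name__ == "__main__":
    # The failing combination from this answer, then the working one.
    print(find_mismatches("2.8.0", "3.10", "11.0", "8.0"))
    print(find_mismatches("2.4.0", "3.6", "11.0", "8.0"))
```

With these rows, the first combination is flagged for its CUDA and cuDNN versions, while the downgraded one is not flagged at all.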

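After trying many things, I created a new conda environment and installed tensorflow-gpu, since I don't care which TF version I get:

conda install tensorflow-gpu -c anaconda

It installed all of the following packages: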

package                    |            build
---------------------------|-----------------
_tflow_select-2.1.0        |              gpu           2 KB  anaconda
absl-py-0.10.0             |           py38_0         170 KB  anaconda
aiohttp-3.6.3              |   py38h7b6447c_0         622 KB  anaconda
astunparse-1.6.3           |             py_0          17 KB  anaconda
async-timeout-3.0.1        |           py38_0          12 KB  anaconda
attrs-20.2.0               |             py_0          41 KB  anaconda
blas-1.0                   |              mkl           6 KB  anaconda
blinker-1.4                |           py38_0          21 KB  anaconda
brotlipy-0.7.0             |py38h7b6447c_1000         349 KB  anaconda
c-ares-1.16.1              |       h7b6447c_0         112 KB  anaconda
ca-certificates-2020.10.14 |                0         128 KB  anaconda
cachetools-4.1.1           |             py_0          12 KB  anaconda
certifi-2020.6.20          |           py38_0         160 KB  anaconda
cffi-1.14.0                |   py38h2e261b9_0         228 KB  anaconda
chardet-3.0.4              |        py38_1003         170 KB  anaconda
click-7.1.2                |             py_0          67 KB  anaconda
cryptography-3.1.1         |   py38h1ba5d50_0         618 KB  anaconda
cudatoolkit-10.1.243       |       h6bb024c_0       513.2 MB  anaconda
cudnn-7.6.5                |       cuda10.1_0       250.6 MB  anaconda
cupti-10.1.168             |                0         1.7 MB  anaconda
gast-0.3.3                 |             py_0          14 KB  anaconda
google-auth-1.22.1         |             py_0          62 KB  anaconda
google-auth-oauthlib-0.4.1 |             py_2          21 KB  anaconda
google-pasta-0.2.0         |             py_0          44 KB  anaconda
grpcio-1.31.0              |   py38hf8bcb03_0         2.3 MB  anaconda
h5py-2.10.0                |   py38hd6299e0_1         1.1 MB  anaconda
hdf5-1.10.6                |       hb1b8bf9_0         4.8 MB  anaconda
idna-2.10                  |             py_0          56 KB  anaconda
importlib-metadata-2.0.0   |             py_1          35 KB  anaconda
intel-openmp-2020.2        |              254         947 KB  anaconda
keras-preprocessing-1.1.0  |             py_1          36 KB  anaconda
libgfortran-ng-7.3.0       |       hdf63c60_0         1.3 MB  anaconda
libprotobuf-3.13.0.1       |       hd408876_0         2.3 MB  anaconda
markdown-3.3.2             |           py38_0         123 KB  anaconda
mkl-2019.4                 |              243       204.1 MB  anaconda
mkl-service-2.3.0          |   py38he904b0f_0          68 KB  anaconda
mkl_fft-1.2.0              |   py38h23d657b_0         173 KB  anaconda
mkl_random-1.1.0           |   py38h962f231_0         398 KB  anaconda
multidict-4.7.6            |   py38h7b6447c_1          72 KB  anaconda
numpy-1.19.1               |   py38hbc911f0_0          20 KB  anaconda
numpy-base-1.19.1          |   py38hfa32c7d_0         5.3 MB  anaconda
oauthlib-3.1.0             |             py_0          88 KB  anaconda
openssl-1.1.1h             |       h7b6447c_0         3.8 MB  anaconda
opt_einsum-3.1.0           |             py_0          54 KB  anaconda
protobuf-3.13.0.1          |   py38he6710b0_1         702 KB  anaconda
pyasn1-0.4.8               |             py_0          58 KB  anaconda
pyasn1-modules-0.2.8       |             py_0          67 KB  anaconda
pycparser-2.20             |             py_2          94 KB  anaconda
pyjwt-1.7.1                |           py38_0          32 KB  anaconda
pyopenssl-19.1.0           |             py_1          47 KB  anaconda
pysocks-1.7.1              |           py38_0          27 KB  anaconda
requests-2.24.0            |             py_0          54 KB  anaconda
requests-oauthlib-1.3.0    |             py_0          22 KB  anaconda
rsa-4.6                    |             py_0          26 KB  anaconda
scipy-1.5.2                |   py38h0b6359f_0        18.7 MB  anaconda
six-1.15.0                 |             py_0          13 KB  anaconda
tensorboard-2.2.1          |     pyh532a8cf_0         2.5 MB  anaconda
tensorboard-plugin-wit-1.6.0|             py_0         663 KB  anaconda
tensorflow-2.2.0           |gpu_py38hb782248_0           4 KB  anaconda
tensorflow-base-2.2.0      |gpu_py38h83e3d50_0       421.3 MB  anaconda
tensorflow-estimator-2.2.0 |     pyh208ff02_0         276 KB  anaconda
tensorflow-gpu-2.2.0       |       h0d30ee6_0           2 KB  anaconda
termcolor-1.1.0            |           py38_1           8 KB  anaconda
urllib3-1.25.11            |             py_0          93 KB  anaconda
werkzeug-1.0.1             |             py_0         243 KB  anaconda
wrapt-1.12.1               |   py38h7b6447c_1          50 KB  anaconda
yarl-1.6.2                 |   py38h7b6447c_0         142 KB  anaconda
zipp-3.3.1                 |             py_0          11 KB  anaconda
------------------------------------------------------------
Total:        1.41 GB

including cudatoolkit and cudnn...

After that, and I don't know why, TF detected the NVIDIA card:

2022-01-25 09:37:52.865587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-01-25 09:37:52.902796: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.903487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:08:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2022-01-25 09:37:52.903637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-01-25 09:37:52.904633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2022-01-25 09:37:52.905878: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2022-01-25 09:37:52.906023: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2022-01-25 09:37:52.907115: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2022-01-25 09:37:52.907719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2022-01-25 09:37:52.910042: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2022-01-25 09:37:52.910137: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.911078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.911707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
Num GPUs Available:  1
Process finished with exit code 0
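The script that produced the output above is not shown. A check of roughly the following shape would print a comparable `Num GPUs Available:` line. This is a sketch under that assumption: `count_visible_gpus` is a hypothetical helper name, while `tf.config.list_physical_devices` is the TensorFlow 2.x call for listing devices.

```python
def count_visible_gpus():
    """Return the number of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Each entry is a PhysicalDevice; an empty list means TensorFlow fell back to CPU.
    return len(tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    print("Num GPUs Available: ", count_visible_gpus())
```

Run it inside the conda environment created above so that the cudatoolkit and cudnn packages installed by conda are the ones picked up.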
