When trying to build TensorFlow from source: Inconsistent CUDA toolkit path: /usr vs /usr/lib



On a fresh LambdaLabs GPU instance, I installed Bazel via Bazelisk:

wget https://github.com/bazelbuild/bazelisk/releases/download/v1.8.1/bazelisk-linux-amd64
chmod +x bazelisk-linux-amd64
sudo mv bazelisk-linux-amd64 /usr/local/bin/bazel

Then I downloaded the TF source:

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r2.11

and ran the configure script:

./configure

The output was:

ubuntu@*********:~/tensorflow$ ./configure
You have bazel 5.3.0 installed.
Please specify the location of python. [Default is /usr/bin/python3]: 

Found possible Python library paths:
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.8/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.
Inconsistent CUDA toolkit path: /usr vs /usr/lib
Asking for detailed CUDA configuration...
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 11]: 
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 2]: 
Please specify the TensorRT version you want to use. [Leave empty to default to TensorRT 6]: 
Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]: 
Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: 
Inconsistent CUDA toolkit path: /usr vs /usr/lib
Asking for detailed CUDA configuration...

I don't know how to tell the build which CUDA toolkit path to use, or even which path is the correct one. I just want to rebuild TF with TFRT support.

Thanks!

EDIT:

When I check where CUDA is installed:

locate cuda | grep /cuda$
/home/ubuntu/tensorflow/tensorflow/compiler/xla/stream_executor/cuda
/home/ubuntu/tensorflow/tensorflow/stream_executor/cuda
/home/ubuntu/tensorflow/third_party/gpus/cuda
/usr/include/cuda
/usr/include/thrust/system/cuda
/usr/lib/cuda
/usr/lib/python3/dist-packages/pycuda/cuda
/usr/lib/python3/dist-packages/tensorflow/include/tensorflow/stream_executor/cuda
/usr/lib/python3/dist-packages/theano/sandbox/cuda
/usr/lib/python3/dist-packages/torch/cuda
/usr/lib/python3/dist-packages/torch/backends/cuda
/usr/lib/python3/dist-packages/torch/include/ATen/cuda
/usr/lib/python3/dist-packages/torch/include/ATen/native/cuda
/usr/lib/python3/dist-packages/torch/include/c10/cuda
/usr/lib/python3/dist-packages/torch/include/torch/csrc/cuda
/usr/lib/python3/dist-packages/torch/include/torch/csrc/jit/cuda
/usr/lib/python3/dist-packages/torch/include/torch/csrc/jit/codegen/cuda
/usr/lib/python3/dist-packages/torch/include/torch/csrc/jit/codegen/fuser/cuda
/usr/share/doc/libthrust-dev/examples/cuda

Apparently the correct path is /usr/lib/, but I don't know how to tell the build to use it.
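For reference, the base-path prompt can also be pre-seeded through environment variables before re-running `./configure` (a sketch, not verified on this instance: `TF_CUDA_PATHS` and `CUDA_TOOLKIT_PATH` are the variables TensorFlow's CUDA-detection script checks, and `/usr/lib/cuda` is the location `locate` reported above):

```shell
# Sketch: point the configure script at the Ubuntu-packaged CUDA layout.
# TF_CUDA_PATHS seeds the "comma-separated list of base paths" prompt;
# CUDA_TOOLKIT_PATH pins the toolkit root directly.
export TF_CUDA_PATHS="/usr/lib/cuda,/usr"
export CUDA_TOOLKIT_PATH="/usr/lib/cuda"
echo "$TF_CUDA_PATHS"
# then re-run: ./configure
```

Typing `/usr/lib/cuda,/usr` at the prompt itself should have the same effect as exporting `TF_CUDA_PATHS`.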

Can you run the commands below to check which versions you have, then build with Bazel and see whether you get any errors? My environment is not very different: I run on Windows 10 with no emulation, although, as the warning message says, they will not allow GPU support on Windows in the next release (they have actually warned about this many times, yet many of us are still on Windows 10).

The point is to make sure requirements and expectations match, i.e. that your setup and the application agree at the level of the low-level toolchain:

  1. nvdisasm --version (to check the installed CUDA toolkit version)
    C:\WINDOWS\system32>nvdisasm --version
    nvdisasm: NVIDIA (R) CUDA disassembler
    Copyright (c) 2005-2021 NVIDIA Corporation
    Built on Sun_Aug_15_21:12:33_Pacific_Daylight_Time_2021
    Cuda compilation tools, release 11.4, V11.4.120
    Build cuda_11.4.r11.4/compiler.30300941_0
  2. nvcc -V (to check the CUDA compiler driver version)
    C:\WINDOWS\system32>nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2021 NVIDIA Corporation
    Built on Sun_Aug_15_21:18:57_Pacific_Daylight_Time_2021
    Cuda compilation tools, release 11.4, V11.4.120
    Build cuda_11.4.r11.4/compiler.30300941_0
  3. nvidia-smi (to see the driver version and the highest CUDA version it supports, 11.6)
    C:\WINDOWS\system32>nvidia-smi
    Tue Nov  8 00:34:17 2022
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 512.15       Driver Version: 512.15       CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0  On |                  N/A |
    |  0%   45C    P8     9W / 120W |   1026MiB /  6144MiB |      2%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
  4. python ./configure.py
    C:\Python310\tensorflow>python ./configure.py
    You have bazel 6.0.0-pre.20221020.1 installed.
    Please specify the location of python. [Default is C:\Python310\python.exe]:
    Found possible Python library paths:
      C:\Python310\lib\site-packages
      Python310\object_detection\models
    Please input the desired Python library path to use.  Default is [C:\Python310\lib\site-packages]
    Do you wish to build TensorFlow with ROCm support? [y/N]: n
    No ROCm support will be enabled for TensorFlow.
    
    
    WARNING: Cannot build with CUDA support on Windows.
    Starting in TF 2.11, CUDA build is not supported for Windows. For using TensorFlow GPU on Windows, you will need to build/install TensorFlow in WSL2.
    
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is /arch:AVX]:
    
    
    Would you like to override eigen strong inline for some C++ compilation to reduce the compilation time? [Y/n]: y
    Eigen strong inline overridden.
    
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
    Not configuring the WORKSPACE for Android builds.
  5. bazel build //tensorflow/tools/pip_package:build_pip_package
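If that build completes, the usual last steps (from TensorFlow's standard source-build workflow; `/tmp/tensorflow_pkg` is an arbitrary output directory, and the two commands are left commented out because they only make sense inside a finished checkout) look like this:

```shell
# After `bazel build` succeeds, package the wheel and install it.
PKG_DIR=/tmp/tensorflow_pkg   # arbitrary output directory
echo "wheel output dir: $PKG_DIR"
# ./bazel-bin/tensorflow/tools/pip_package/build_pip_package "$PKG_DIR"
# pip install "$PKG_DIR"/tensorflow-*.whl
```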
