构建失败 - GPU 服务器中的 Tensorflow - 错误 - 找不到 @local_config_cuda//c



我正在使用基于 GPU 的服务器和 GTX-1080 和 Ubuntu 16.04LTS。 使用正常的张量流安装,我在运行应用程序时收到以下警告-

2017-08-01 14:49:57.232126: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232157: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232162: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232165: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 14:49:57.232169: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

当尝试构建最新的代码版本 -v1.3.0-rc1-531-gcd4c17e 时,本地构建失败,并显示以下错误消息

user@devbox:~/Workouts/tensorflow$ bazel build --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" ./tensorflow/tools/pip_package:build_pip_package 
.......
ERROR: no such package '@local_config_cuda//crosstool': Traceback (most recent call last):
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 1039
_create_local_cuda_repository(repository_ctx)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 976, in _create_local_cuda_repository
_host_compiler_includes(repository_ctx, cc)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 145, in _host_compiler_includes
get_cxx_inc_directories(repository_ctx, cc)
File "/home/user/Workouts/tensorflow/third_party/gpus/cuda_configure.bzl", line 120, in get_cxx_inc_directories
set(includes_cpp)
depsets cannot contain mutable items
INFO: Elapsed time: 5.488s
FAILED: Build did NOT complete successfully (3 packages loaded)

下面给出了有关平台和配置的一些其他详细信息

user@gpu-devbox:~/Workouts/tensorflow$ python --version
Python 2.7.12
user@gpu-devbox:~/Workouts/tensorflow$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
user@gpu-devbox:~/Workouts/tensorflow$ bazel version
Build label: 0.5.3
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jul 28 08:34:59 2017 (1501230899)
Build timestamp: 1501230899
Build timestamp as int: 1501230899

配置准备如下

user@gpu-devbox:~/Workouts/tensorflow$ ./configure 
WARNING: Running Bazel server needs to be killed, because the startup options are different.
Please specify the location of python. [Default is /usr/bin/python]: 
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: 
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: 
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [y/N]: 
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: 
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: 
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL support? [y/N]: 
No OpenCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at:     https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]6.1
Do you want to use clang as CUDA compiler? [y/N]: 
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Do you wish to build TensorFlow with MPI support? [y/N]: 
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished

一些额外的细节 -

user@gpu-devbox:~/Workouts/tensorflow$ echo $CUDA_HOME
/usr/local/cuda-8.0
user@gpu-devbox:~/Workouts/tensorflow$ echo $LD_LIBRARY_PATH
/usr/local/cuda-8.0/lib64

我错过了什么吗?

尝试将其添加到构建命令中

--crosstool_top=@local_config_cuda//crosstool:toolchain

这是我们所有用于TF + TF服务的Dockerfile,包括CPU,GPU,AVX等。

https://github.com/fluxcapacitor/pipeline/tree/master/package.ml/tensorflow/16d39e9-d690fdd

命名约定是 Tensorflow-[tf_git_hash]-[tf_serving_git_hash]。 这些 Dockerfile 与几天前一样最新。

另一个很好的资源是TensorFlow Jenkins/CI页面: http://ci.tensorflow.org/

它们的构建是大量参数化的,因此它们适应您自己的环境有点棘手,但绝对帮助我们达到了上面提到的 Dockerfile。

相关内容

最新更新