Here are my specs:
- GTX 1070
- Driver 367 (installed from .run)
- Ubuntu 16.04
- CUDA 8.0 (installed from .run)
- cuDNN 5
- Bazel 0.3.0 (potential problem?)
- gcc 4.9.3
- TensorFlow installed from source
Verifying versions:
volcart@volcart-Precision-Tower-7910:~/$ nvidia-smi
Fri Aug 5 15:03:32 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35 Driver Version: 367.35 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 Off | 0000:03:00.0 On | N/A |
| 0% 38C P8 11W / 185W | 495MiB / 8113MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 20303 G /usr/lib/xorg/Xorg 280MiB |
| 0 20909 G compiz 114MiB |
| 0 21562 G ...s-passed-by-fd --v8-snapshot-passed-by-fd 98MiB |
+-----------------------------------------------------------------------------+
volcart@volcart-Precision-Tower-7910:~/$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
volcart@volcart-Precision-Tower-7910:~/$ bazel version
Build label: 0.3.0
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jun 10 11:38:23 2016 (1465558703)
Build timestamp: 1465558703
Build timestamp as int: 1465558703
volcart@volcart-Precision-Tower-7910:~/$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.9.3-13ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.9.3 (Ubuntu 4.9.3-13ubuntu2)
I did switch Bazel versions, so I ran bazel clean, which completed successfully.
I can verify that CUDA works by running deviceQuery in ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery:
volcart@volcart-Precision-Tower-7910:~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1070"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8113 MBytes (8507162624 bytes)
(15) Multiprocessors, (128) CUDA Cores/MP: 1920 CUDA Cores
GPU Max Clock rate: 1797 MHz (1.80 GHz)
Memory Clock rate: 4004 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS
When I run ./configure, I accept all the default values.
Current errors
When I build the training example, I get:
volcart@volcart-Precision-Tower-7910:/usr/local/lib/python2.7/dist-packages/tensorflow$ sudo bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
Sending SIGTERM to previous Bazel server (pid=7108)... done.
.
INFO: Found 1 target...
...
./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = long unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/common_runtime/gpu/gpu_device.cc:567:5: required from here
./tensorflow/core/platform/default/logging.h:197:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
#define TF_PREDICT_TRUE(x) (x)
^
./tensorflow/core/platform/default/logging.h:197:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
^
ERROR: /usr/local/lib/python2.7/dist-packages/tensorflow/tensorflow/cc/BUILD:199:1: Linking of rule '//tensorflow/cc:tutorials_example_trainer' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/local_linux-opt/bin/tensorflow/cc/tutorials_example_trainer ... (remaining 805 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/local_linux-opt/bin/tensorflow/cc/_objs/tutorials_example_trainer/tensorflow/cc/tutorials/example_trainer.o: In function `tensorflow::example::ConcurrentSteps(tensorflow::example::Options const*, int)':
example_trainer.cc:(.text._ZN10tensorflow7example15ConcurrentStepsEPKNS0_7OptionsEi+0x517): undefined reference to `google::protobuf::internal::empty_string_'
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/libidentity_reader_op.lo(identity_reader_op.o): In function `tensorflow::IdentityReader::SerializeStateLocked(std::string*)':
identity_reader_op.cc:(.text._ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs[_ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs]+0x36): undefined reference to `google::protobuf::MessageLite::SerializeToString(std::string*) const'
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/libwhole_file_read_ops.lo(whole_file_read_ops.o): In function `tensorflow::WholeFileReader::SerializeStateLocked(std::string*)':
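When triaging a link failure like this one, it can help to see which library the unresolved symbols belong to. The following is a minimal Python sketch that groups the "undefined reference" lines by namespace. The helper names (undefined_symbols, count_by_namespace) are made up for illustration, and the input is just the two complete linker lines copied from the log above.

```python
import re
from collections import Counter

# Two linker lines copied from the log above (GNU ld quotes symbols as `name').
LINKER_OUTPUT = """\
example_trainer.cc:(.text._ZN10tensorflow7example15ConcurrentStepsEPKNS0_7OptionsEi+0x517): undefined reference to `google::protobuf::internal::empty_string_'
identity_reader_op.cc:(.text._ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs[_ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs]+0x36): undefined reference to `google::protobuf::MessageLite::SerializeToString(std::string*) const'
"""

# Captures the symbol between the opening backtick and the closing quote.
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(text):
    """Return every symbol named in an 'undefined reference' line."""
    return UNDEFINED_REF.findall(text)


def count_by_namespace(symbols, depth=2):
    """Count symbols by their first `depth` namespace components."""
    counts = Counter()
    for symbol in symbols:
        name = symbol.split("(", 1)[0]  # drop the argument list, if any
        counts["::".join(name.split("::")[:depth])] += 1
    return counts


symbols = undefined_symbols(LINKER_OUTPUT)
for namespace, count in count_by_namespace(symbols).items():
    print(namespace, count)
```

For the lines shown, both unresolved symbols fall under google::protobuf, which suggests looking at how protobuf is being built and linked, rather than at the TensorFlow sources themselves.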
When I try to build the pip package, I get:
volcart@volcart-Precision-Tower-7910:/usr/local/lib/python2.7/dist-packages/tensorflow$ sudo bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
WARNING: /usr/local/lib/python2.7/dist-packages/tensorflow/util/python/BUILD:11:16: in includes attribute of cc_library rule //util/python:python_headers: 'python_include' resolves to 'util/python/python_include' not in 'third_party'. This will be an error in the future.
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/bit_depth.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/gemmlowp.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/map.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/output_stages.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/instrumentation.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/profiler.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
INFO: Found 1 target...
INFO: From Compiling external/protobuf/src/google/protobuf/util/internal/utility.cc [for host]:
...
INFO: From Compiling tensorflow/core/distributed_runtime/tensor_coding.cc:
tensorflow/core/distributed_runtime/tensor_coding.cc: In member function 'bool tensorflow::TensorResponse::ParseTensorSubmessage(google::protobuf::io::CodedInputStream*, tensorflow::TensorProto*)':
tensorflow/core/distributed_runtime/tensor_coding.cc:123:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (num_bytes != buf.size()) return false;
^
ERROR: /usr/local/lib/python2.7/dist-packages/tensorflow/tensorflow/core/kernels/BUILD:1498:1: undeclared inclusion(s) in rule '//tensorflow/core/kernels:batchtospace_op_gpu':
this rule is missing dependency declarations for the following files included by 'tensorflow/core/kernels/batchtospace_op_gpu.cu.cc':
'/usr/local/cuda-8.0/include/cuda_runtime.h'
'/usr/local/cuda-8.0/include/host_config.h'
'/usr/local/cuda-8.0/include/builtin_types.h'
'/usr/local/cuda-8.0/include/device_types.h'
'/usr/local/cuda-8.0/include/host_defines.h'
'/usr/local/cuda-8.0/include/driver_types.h'
'/usr/local/cuda-8.0/include/surface_types.h'
'/usr/local/cuda-8.0/include/texture_types.h'
'/usr/local/cuda-8.0/include/vector_types.h'
'/usr/local/cuda-8.0/include/library_types.h'
'/usr/local/cuda-8.0/include/channel_descriptor.h'
'/usr/local/cuda-8.0/include/cuda_runtime_api.h'
'/usr/local/cuda-8.0/include/cuda_device_runtime_api.h'
'/usr/local/cuda-8.0/include/driver_functions.h'
'/usr/local/cuda-8.0/include/vector_functions.h'
'/usr/local/cuda-8.0/include/vector_functions.hpp'
'/usr/local/cuda-8.0/include/common_functions.h'
'/usr/local/cuda-8.0/include/math_functions.h'
'/usr/local/cuda-8.0/include/math_functions.hpp'
'/usr/local/cuda-8.0/include/math_functions_dbl_ptx3.h'
'/usr/local/cuda-8.0/include/math_functions_dbl_ptx3.hpp'
'/usr/local/cuda-8.0/include/cuda_surface_types.h'
'/usr/local/cuda-8.0/include/cuda_texture_types.h'
'/usr/local/cuda-8.0/include/device_functions.h'
'/usr/local/cuda-8.0/include/device_functions.hpp'
'/usr/local/cuda-8.0/include/device_atomic_functions.h'
'/usr/local/cuda-8.0/include/device_atomic_functions.hpp'
'/usr/local/cuda-8.0/include/device_double_functions.h'
'/usr/local/cuda-8.0/include/device_double_functions.hpp'
'/usr/local/cuda-8.0/include/sm_20_atomic_functions.h'
'/usr/local/cuda-8.0/include/sm_20_atomic_functions.hpp'
'/usr/local/cuda-8.0/include/sm_32_atomic_functions.h'
'/usr/local/cuda-8.0/include/sm_32_atomic_functions.hpp'
'/usr/local/cuda-8.0/include/sm_35_atomic_functions.h'
'/usr/local/cuda-8.0/include/sm_60_atomic_functions.h'
'/usr/local/cuda-8.0/include/sm_60_atomic_functions.hpp'
'/usr/local/cuda-8.0/include/sm_20_intrinsics.h'
'/usr/local/cuda-8.0/include/sm_20_intrinsics.hpp'
'/usr/local/cuda-8.0/include/sm_30_intrinsics.h'
'/usr/local/cuda-8.0/include/sm_30_intrinsics.hpp'
'/usr/local/cuda-8.0/include/sm_32_intrinsics.h'
'/usr/local/cuda-8.0/include/sm_32_intrinsics.hpp'
'/usr/local/cuda-8.0/include/sm_35_intrinsics.h'
'/usr/local/cuda-8.0/include/surface_functions.h'
'/usr/local/cuda-8.0/include/texture_fetch_functions.h'
'/usr/local/cuda-8.0/include/texture_indirect_functions.h'
'/usr/local/cuda-8.0/include/surface_indirect_functions.h'
'/usr/local/cuda-8.0/include/device_launch_parameters.h'
'/usr/local/cuda-8.0/include/cuda_fp16.h'
'/usr/local/cuda-8.0/include/math_constants.h'
'/usr/local/cuda-8.0/include/curand_kernel.h'
'/usr/local/cuda-8.0/include/curand.h'
'/usr/local/cuda-8.0/include/curand_discrete.h'
'/usr/local/cuda-8.0/include/curand_precalc.h'
'/usr/local/cuda-8.0/include/curand_mrg32k3a.h'
'/usr/local/cuda-8.0/include/curand_mtgp32_kernel.h'
'/usr/local/cuda-8.0/include/cuda.h'
'/usr/local/cuda-8.0/include/curand_mtgp32.h'
'/usr/local/cuda-8.0/include/curand_philox4x32_x.h'
'/usr/local/cuda-8.0/include/curand_globals.h'
'/usr/local/cuda-8.0/include/curand_uniform.h'
'/usr/local/cuda-8.0/include/curand_normal.h'
'/usr/local/cuda-8.0/include/curand_normal_static.h'
'/usr/local/cuda-8.0/include/curand_lognormal.h'
'/usr/local/cuda-8.0/include/curand_poisson.h'
'/usr/local/cuda-8.0/include/curand_discrete2.h'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 138.913s, Critical Path: 102.63s
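Every header listed in the "undeclared inclusion(s)" error sits in a single directory, /usr/local/cuda-8.0/include. The Python sketch below extracts that directory from an abbreviated excerpt of the error and prints it as a cxx_builtin_include_directory entry, which is the CROSSTOOL field Bazel uses for toolchain include directories. Whether adding such an entry to the CROSSTOOL file under third_party/gpus/crosstool is the right fix for this checkout is an assumption, not something the log confirms. The function names are illustrative.

```python
import posixpath
import re

# Abbreviated excerpt of the error above; the real list is much longer.
ERROR_EXCERPT = """\
this rule is missing dependency declarations for the following files included by 'tensorflow/core/kernels/batchtospace_op_gpu.cu.cc':
  '/usr/local/cuda-8.0/include/cuda_runtime.h'
  '/usr/local/cuda-8.0/include/host_config.h'
  '/usr/local/cuda-8.0/include/curand_discrete2.h'.
"""

# Matches single-quoted absolute paths only, so the relative .cu.cc path is skipped.
QUOTED_ABSOLUTE_PATH = re.compile(r"'(/[^']+)'")


def undeclared_include_dirs(error_text):
    """Return the sorted, de-duplicated directories of the reported headers."""
    paths = QUOTED_ABSOLUTE_PATH.findall(error_text)
    return sorted({posixpath.dirname(path) for path in paths})


def crosstool_lines(include_dirs):
    """Format each directory as a CROSSTOOL include-directory entry."""
    return ['cxx_builtin_include_directory: "%s"' % d for d in include_dirs]


for line in crosstool_lines(undeclared_include_dirs(ERROR_EXCERPT)):
    print(line)
```

It is also worth checking whether /usr/local/cuda is a symlink to /usr/local/cuda-8.0. If the toolchain declares the former while the compiler reports the resolved path, that mismatch would explain the error.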
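I've seen some people complain about Bazel 0.3.1 and say that a downgrade to 0.3.0 may be needed. The error output you posted doesn't say much: it only shows the parent script reporting that a child script failed. The console should have more information about the actual error.
Two days ago I went through the setup steps for a GTX 1080 and used this configuration: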
Ubuntu 16.04
Nvidia Driver: nvidia-367.35 (installed from .run file)
Bazel 0.3.0
gcc: 4.9.3 (default with 16.04)
CUDA 8.0.27 (installed from .run file into default dirs)
compute capability: (use default values for config)
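To see where the two setups in this thread actually differ, the versions can be put side by side. This is a small Python sketch. The dictionary keys are illustrative names, and the values are taken only from the version output and lists quoted in this thread (nvcc reports V8.0.26 in the question, and the list above gives CUDA 8.0.27).

```python
# Versions as reported in this thread; key names are illustrative.
FAILING_SETUP = {
    "gpu": "GTX 1070",
    "driver": "367.35",
    "ubuntu": "16.04",
    "cuda": "8.0.26",
    "bazel": "0.3.0",
    "gcc": "4.9.3",
}
WORKING_SETUP = {
    "gpu": "GTX 1080",
    "driver": "367.35",
    "ubuntu": "16.04",
    "cuda": "8.0.27",
    "bazel": "0.3.0",
    "gcc": "4.9.3",
}


def differences(first, second):
    """Map each key whose values differ to a (first, second) tuple."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }


for key, (failing, working) in differences(FAILING_SETUP, WORKING_SETUP).items():
    print("%s: %s vs %s" % (key, failing, working))
```

Apart from the GPU model, the only reported difference is the CUDA 8.0 point release. Driver, Bazel, and gcc versions match, so the Bazel version alone seems an unlikely explanation for the failure.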