在Power9 w/GPU上使用LightGBM 2.2.4、Boost 1.64.0构建问题



我正试图在运行Red Hat Enterprise Server 7.5(Maipo)的IBM Power9系统("Witherspoon",CPU是电源系统AC9228335-GTH)上构建LightGBM 2.2.4版本(git hash 5256cda69300d6b83b18180da2992a1e50a6b392)。

我使用的是RHEL打包的C编译器,gcc 4.8.5,cmake的本地版本3.13.1,以及Boost版本1.64.0的本地安装。系统安装了CUDA 9.2,我已经找到了libOpenCL目录和include文件。

我的配置操作是(从解压缩的LightGBM树的根目录中新创建的构建目录中):

# export BOOST_ROOT=/share/sw/boost/1_64_0/ 
# cmake3 -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/lib64/nvidia/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/include/CL/ .. 
# make

配置步骤显然成功了,生成了一个可运行的makefile。

构建失败率约为41%,错误来自Boost:的内部深处


[ 41%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/data_parallel_tree_learner.cpp.o
In file included from /share/sw/boost/1_64_0/include/boost/mpl/aux_/integral_wrapper.hpp:22:0,
from /share/sw/boost/1_64_0/include/boost/mpl/int.hpp:20,
from /share/sw/boost/1_64_0/include/boost/mpl/lambda_fwd.hpp:23,
from /share/sw/boost/1_64_0/include/boost/mpl/aux_/na_spec.hpp:18,
from /share/sw/boost/1_64_0/include/boost/mpl/identity.hpp:17,
from /share/sw/boost/1_64_0/include/boost/iterator/detail/enable_if.hpp:11,
from /share/sw/boost/1_64_0/include/boost/iterator/transform_iterator.hpp:11,
from /share/sw/boost/1_64_0/include/boost/algorithm/string/iter_find.hpp:17,
from /share/sw/boost/1_64_0/include/boost/algorithm/string/split.hpp:16,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/device.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/context.hpp:19,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/buffer.hpp:15,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/core.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/gpu_tree_learner.h:27,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/parallel_tree_learner.h:5,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/data_parallel_tree_learner.cpp:1:
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:28:18: error: pasting ")" and "20" does not give a valid preprocessing token
BOOST_PP_CAT(vector, BOOST_MPL_LIMIT_VECTOR_SIZE).hpp 
^
/share/sw/boost/1_64_0/include/boost/preprocessor/cat.hpp:29:34: note: in definition of macro ‘BOOST_PP_CAT_I’
#    define BOOST_PP_CAT_I(a, b) a ## b
^
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:28:5: note: in expansion of macro ‘BOOST_PP_CAT’
BOOST_PP_CAT(vector, BOOST_MPL_LIMIT_VECTOR_SIZE).hpp 
^
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:36:49: note: in expansion of macro ‘AUX778076_VECTOR_HEADER’
#   include BOOST_PP_STRINGIZE(boost/mpl/vector/AUX778076_VECTOR_HEADER)
^
In file included from /share/sw/boost/1_64_0/include/boost/math/policies/policy.hpp:14:0,
from /share/sw/boost/1_64_0/include/boost/math/special_functions/math_fwd.hpp:28,
from /share/sw/boost/1_64_0/include/boost/math/special_functions/sign.hpp:17,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/inf_nan.hpp:34,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/converter_lexical_streams.hpp:63,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/converter_lexical.hpp:54,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/try_lexical_convert.hpp:42,
from /share/sw/boost/1_64_0/include/boost/lexical_cast.hpp:32,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/detail/meta_kernel.hpp:23,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/iterator/buffer_iterator.hpp:26,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/algorithm/detail/copy_on_device.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/algorithm/copy.hpp:26,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/container/vector.hpp:32,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/gpu_tree_learner.h:28,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/parallel_tree_learner.h:5,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/data_parallel_tree_learner.cpp:1:
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:36:73: fatal error: boost/mpl/__attribute__((altivec(vector__)))/__attribute__((altivec(vector__)))20.hpp: No such file or directory
#   include BOOST_PP_STRINGIZE(boost/mpl/vector/AUX778076_VECTOR_HEADER)

从消息中看,似乎有一些预处理器字符串操作出错,它可能试图在boot/mpl/vvector include目录中找到"vector20.hpp"文件,但BOOST_PP_CAT操作出错,所以无法构造正确的文件名?此外,还涉及"altivec",Power9 CPU具有altivec功能,可能需要额外的标头或编译器开关?

我可以在Debian 9"stretch"系统上成功构建(带有警告),该系统具有x86_64架构和CUDA 9.1(用于libOpenCL的东西),带有Debian封装的Boost 1.62版本。

我还尝试针对Boost 1.69和Boost 1.62(Debian上的版本)构建Power9版本,但在同一个地方也出现了同样的错误。

帮助?

这在LightGBM github上的一个问题中得到了解决,我在最初的搜索中不知怎么错过了这个问题。

这种构建尝试被误导了。

编译问题显然是altivec/boost交互,Power架构上没有OpenCL GPU支持,而LightGBM是OpenCL的幕后推手,所以无论如何,这项工作都是注定要失败的。

最新更新