I'm trying to build a Docker image that I can use as a development environment for modifying PyTorch. The repo provides a Dockerfile, and I'm trying the following:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
DOCKER_BUILDKIT=1 docker build -t pytorchtest .
But the docker build fails with the following error:
...
#20 28.80 Performing C++ SOURCE FILE Test HAS_WERROR_CAST_FUNCTION_TYPE failed with the following output:
#20 28.80 Change Dir: /opt/pytorch/build/CMakeFiles/CMakeTmp
#20 28.80
#20 28.80 Run Build Command(s):/usr/bin/make -f Makefile cmTC_09005/fast && /usr/bin/make -f CMakeFiles/cmTC_09005.dir/build.make CMakeFiles/cmTC_09005.dir/build
#20 28.80 make[1]: Entering directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 28.80 Building CXX object CMakeFiles/cmTC_09005.dir/src.cxx.o
#20 28.80 /usr/bin/c++ -DHAS_WERROR_CAST_FUNCTION_TYPE -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -fPIE -Werror=cast-function-type -o CMakeFiles/cmTC_09005.dir/src.cxx.o -c /opt/pytorch/build/CMakeFiles/CMakeTmp/src.cxx
#20 28.80 cc1plus: error: -Werror=cast-function-type: no option -Wcast-function-type
#20 28.80 CMakeFiles/cmTC_09005.dir/build.make:77: recipe for target 'CMakeFiles/cmTC_09005.dir/src.cxx.o' failed
#20 28.80 make[1]: *** [CMakeFiles/cmTC_09005.dir/src.cxx.o] Error 1
#20 28.80 make[1]: Leaving directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 28.80 Makefile:127: recipe for target 'cmTC_09005/fast' failed
#20 28.80 make: *** [cmTC_09005/fast] Error 2
#20 28.80
#20 28.80
#20 28.80 Source file was:
#20 28.80 int main() { return 0; }
#20 DONE 29.0s
------
executor failed running [/bin/sh -c TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" python setup.py install]: exit code: 1
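I can't get at the error logs because they live in the temporary filesystem of the image build process.
It seems a bit odd to me that building an image of a stable release fails. Am I doing something wrong?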
Dockerfile:
# syntax = docker/dockerfile:experimental
#
# NOTE: To build this you will need a docker version > 18.06 with
# experimental enabled and DOCKER_BUILDKIT=1
#
# If you do not use buildkit you are not going to have a good time
#
# For reference:
# https://docs.docker.com/develop/develop-images/build_enhancements/
ARG BASE_IMAGE=ubuntu:18.04
ARG PYTHON_VERSION=3.8
FROM ${BASE_IMAGE} as dev-base
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        ccache \
        # cmake=3.10.2-1ubuntu2.18.04.2
        cmake \
        curl \
        git \
        libjpeg-dev \
        libpng-dev && \
    rm -rf /var/lib/apt/lists/*
RUN /usr/sbin/update-ccache-symlinks
RUN mkdir /opt/ccache && ccache --set-config=cache_dir=/opt/ccache
ENV PATH /opt/conda/bin:$PATH
FROM dev-base as conda
ARG PYTHON_VERSION=3.8
# Automatically set by buildx
ARG TARGETPLATFORM
# translating Docker's TARGETPLATFORM into miniconda arches
RUN case ${TARGETPLATFORM} in \
         "linux/arm64") MINICONDA_ARCH=aarch64 ;; \
         *) MINICONDA_ARCH=x86_64 ;; \
    esac && \
    curl -fsSL -v -o ~/miniconda.sh -O "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${MINICONDA_ARCH}.sh"
COPY requirements.txt .
RUN chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda install -y python=${PYTHON_VERSION} cmake conda-build pyyaml numpy ipython && \
    /opt/conda/bin/python -mpip install -r requirements.txt && \
    /opt/conda/bin/conda clean -ya
FROM dev-base as submodule-update
WORKDIR /opt/pytorch
COPY . .
RUN git submodule update --init --recursive --jobs 0
FROM conda as build
WORKDIR /opt/pytorch
COPY --from=conda /opt/conda /opt/conda
COPY --from=submodule-update /opt/pytorch /opt/pytorch
RUN --mount=type=cache,target=/opt/ccache \
    TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" \
    CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" \
    python setup.py install || cat /opt/pytorch/build/CMakeFiles/CMakeError.log
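The problem is with the COPY --from=submodule-update /opt/pytorch /opt/pytorch instruction: some .bzl files are not copied. More precisely, they are never added to the Docker build context in the first place, because of the .dockerignore file. I added the following line at the end of .dockerignore, and now it works: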
!*.bzl
As far as I can tell, this is a bug. These files are committed to the repo, so they should be copied.
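To illustrate why the position of that line matters: Docker documents that the last rule in .dockerignore that matches a file decides whether the file is kept, and that a leading ! marks an exception. The sketch below is only a toy model of that ordering. The function name context_status and the rule tools/* are hypothetical (this is not PyTorch's actual .dockerignore), and plain shell globs stand in for Docker's real matcher, which follows Go's filepath.Match rules and treats path separators differently.

```shell
#!/bin/sh
# Toy model of .dockerignore evaluation: rules are checked in order, and the
# last rule that matches the path decides. Shell globs are an approximation
# of Docker's matcher (assumption), good enough to show the ordering effect.
# Usage: context_status PATH RULE...   (prints "included" or "excluded")
context_status() {
  path=$1
  shift
  status=included
  for rule in "$@"; do
    case $rule in
      '!'*)
        # Exception rule: drop the leading "!" and re-include on a match.
        pattern=${rule#?}
        case $path in
          $pattern) status=included ;;
        esac
        ;;
      *)
        # Ordinary rule: exclude on a match.
        case $path in
          $rule) status=excluded ;;
        esac
        ;;
    esac
  done
  echo "$status"
}

# Hypothetical rules, for illustration only.
context_status tools/defs.bzl 'tools/*'           # prints "excluded"
context_status tools/defs.bzl 'tools/*' '!*.bzl'  # prints "included"
```

Because the exception comes last, it overrides any earlier rule that excluded the same file, which is consistent with the fix working only when the line is appended at the end.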