Problem building a specific Docker image
The image name is: 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04:latest
This image name is based on AWS's official list of available images: https://github.com/aws/deep-learning-containers/blob/master/available_images.md
Here is the Dockerfile:
ARG AWS_ACCOUNT_ID
ARG AWS_DEFAULT_REGION
FROM 763104351884.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04:latest
I ran the following commands:
aws ecr get-login-password --region ${AWS_DEFAULT_REGION} | docker login --username AWS --password-stdin 763104351884.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com
docker compose build --build-arg AWS_ACCOUNT_ID=763104351884 --build-arg AWS_DEFAULT_REGION=us-east-1 --no-cache
The error was:
[+] Building 0.2s (2/2) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 519B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
failed to solve: failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to parse stage name "763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04:latest": invalid reference format
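The solution is to remove the trailing :latest from the image name. A Docker image reference can carry only one tag, introduced by a single colon after the repository name. Here 1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04 is already the tag, so appending :latest adds a second tag and makes the reference invalid, which is what caused the "invalid reference format" error.
So once I changed
763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04:latest
to
763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
the image started to build.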
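For reference, a minimal sketch of the Dockerfile with the fix applied. It assumes nothing else in the file changes: the two ARG lines are kept as in the question, and the only edit is dropping the trailing :latest so that the reference has a single tag.

```dockerfile
# Build args declared before FROM can be used inside the FROM line.
ARG AWS_ACCOUNT_ID
ARG AWS_DEFAULT_REGION

# Exactly one tag after the repository name; no trailing ":latest".
FROM 763104351884.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/huggingface-pytorch-training:1.10.2-transformers4.17.0-gpu-py38-cu113-ubuntu20.04
```

The build command stays the same as before; AWS_DEFAULT_REGION still has to be passed with --build-arg, since no default value is set here.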