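I am trying to use docker compose to create and start a small ECS Fargate cluster with only 2 containers. One of the container images lives in my private repository on DockerHub. However, the command
docker compose --file path-to-docker-compose-yml-file up
keeps failing when run in the ECS context, with the error message: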
QuarkustodoService TaskFailedToStart: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to get registry auth from asm: service call has been retried 1 time(s): failed to fetch secret arn:aws:secretsmanager:us-east-2:1071311304...
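However, the ecsTaskExecutionRole of my AWS user does have the necessary policies to fetch the secret and decrypt the DockerHub credentials stored in AWS Secrets Manager and KMS. I use my DockerHub user ID and an access token as the credentials, and I have verified that they work.
Can anyone help, or does anyone have an idea how to debug this?
The full command-line output is: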
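As an illustration of what "necessary policies" usually means for private registry authentication, here is a minimal sketch that builds such a policy document. It assumes the commonly documented actions `secretsmanager:GetSecretValue` and, only when the secret is encrypted with a customer managed key, `kms:Decrypt`. The function name and the placeholder ARN are hypothetical and are not taken from the setup described above.

```python
import json


def registry_auth_policy(secret_arn, kms_key_arn=None):
    """Build an IAM policy document that lets a task execution role read one secret."""
    statements = [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": [secret_arn],
        }
    ]
    # kms:Decrypt is only added when a customer managed KMS key ARN is supplied.
    if kms_key_arn is not None:
        statements.append(
            {"Effect": "Allow", "Action": ["kms:Decrypt"], "Resource": [kms_key_arn]}
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder ARN (hypothetical account ID and suffix), not a real secret.
    arn = "arn:aws:secretsmanager:us-east-2:111111111111:secret:DockerHubAPIToken-abc123"
    print(json.dumps(registry_auth_policy(arn), indent=2))
```

Comparing the printed document with the policies actually attached to the role the task runs with can help rule out a permissions gap, in particular a `Resource` ARN that does not match the secret named in the error message.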
docker compose --file path-to-docker-compose.yml up
[+] Running 17/17
- quarkus-todo DeleteComplete 205.2s
- Cluster DeleteComplete 154.2s
- Quarkustodo8080TargetGroup DeleteComplete 155.5s
- CloudMap DeleteComplete 200.2s
- LogGroup DeleteComplete 157.5s
- DbTaskExecutionRole DeleteComplete 147.6s
- QuarkustodoTaskExecutionRole DeleteComplete 157.5s
- DefaultNetwork DeleteComplete 154.2s
- Quarkustodo8080Listener DeleteComplete 152.3s
- DefaultNetworkIngress DeleteComplete 82.4s
- Default8080Ingress DeleteComplete 81.2s
- DbTaskDefinition DeleteComplete 127.4s
- QuarkustodoTaskDefinition DeleteComplete 137.2s
- QuarkustodoServiceDiscoveryEntry DeleteComplete 106.1s
- DbServiceDiscoveryEntry DeleteComplete 96.4s
- DbService DeleteComplete 90.2s
- QuarkustodoService DeleteComplete 100.3s
QuarkustodoService TaskFailedToStart: ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to get registry auth from asm: service call has been retried 1 time(s): failed to fetch secret arn:aws:secretsmanager:us-east-2:1071311304...
I am using:
Docker version 20.10.6 on Win10
aws --version
aws-cli/2.2.9 Python/3.8.8 Windows/10 exe/AMD64 prompt/off
ecs-cli --version
ecs-cli version 1.21.0 (bb0b8f0)
Here is my docker-compose.yml file:
version: "3.8"
x-aws-vpc: "vpc-07cdb7bacc9b8010a"
x-aws-loadbalancer: "exter-Publi-FY7S28M1QL7L"
services:
  quarkus-todo:
    image: bergemannf/mytoolchain:quarkus-todo-ce
    x-aws-pull_credentials: arn:aws:secretsmanager:us-east-2:1111111111111:secret:DockerHubAPIToken-d9RKLn
    ports:
      - target: 8080
        x-aws-protocol: http
  db:
    image: postgres
    environment:
      POSTGRES_USER: "<some user-id>"
      POSTGRES_PASSWORD: "<some pw>"
      POSTGRES_DB: "<some db>"
Here is my ecs-param.yml file:
version: 1
task_definition:
  ecs_network_mode: awsvpc
  task_role_arn: arn:aws:iam::107131130437:role/ECSTaskRole
  task_execution_role: arn:aws:iam::1111111111:role/ecsTaskExecutionRole
  task_size:
    cpu_limit: 256
    mem_limit: 512
  pid_mode: task
  ipc_mode: task
  services:
    quarkus-todo:
      essential: true
      depends_on:
        - container_name: db
          condition: START
      init_process_enabled: false
      healthcheck:
        test: ["CMD", "curl -f http://localhost"]
        interval: 10
        timeout: 5
        retries: 3
        start_period: 180
      secrets:
        - value_from: arn:aws:secretsmanager:us-east-2:107131130437:secret:DockerHubAPIToken-d9RKLn
          # name: dev/DockerHubAccessToken
    db:
      essential: false
  efs_volumes:
    - name: postgres-db-efs
      filesystem_id: fs-5473872f
      root_directory: /
      access_point: fsap-11111111
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - subnet-0af2d8c8faa7f6b9f
        - subnet-039c3a3061848c2a9
      security_groups:
        - sg-0d52c217fa0f25cfb
      assign_public_ip: ENABLED
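First, you mention the ecs-cli and the ecs-param.yml file, but the Compose/ECS integration does not use them (for the record, the ECS CLI has evolved into AWS Copilot, which dropped Compose compatibility in favor of the new docker compose implementation you are using).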
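That said, your compose file actually looks fine. All you need to do is set the x-aws-pull_credentials parameter, and Docker will generate a CFN (CloudFormation) template with all the required wiring. For example, this is the compose file I am using: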
version: "3.8"
services:
  myweb:
    image: mreferre/nginx-custom-site
    x-aws-pull_credentials: arn:aws:secretsmanager:us-west-2:111111111111:secret:dockerhubAccessToken-xyzwh
    ports:
      - "80:80"
And it works just fine:
$ docker compose up
[+] Running 14/14
⠿ downloads CreateComplete 190.3s
⠿ Cluster CreateComplete 6.0s
⠿ CloudMap CreateComplete 47.0s
⠿ DefaultNetwork CreateComplete 5.0s
⠿ MywebTaskExecutionRole CreateComplete 20.1s
⠿ MywebTCP80TargetGroup CreateComplete 1.0s
⠿ LogGroup CreateComplete 2.0s
⠿ LoadBalancer CreateComplete 92.0s
⠿ DefaultNetworkIngress CreateComplete 1.0s
⠿ Default80Ingress CreateComplete 1.0s
⠿ MywebTaskDefinition CreateComplete 3.0s
⠿ MywebServiceDiscoveryEntry CreateComplete 2.0s
⠿ MywebTCP80Listener CreateComplete 2.0s
⠿ MywebService CreateComplete 76.6s
$
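[Note that the image I am using is not private, but the whole flow of pulling the image and logging in to DockerHub completes without errors.]
The only thing I can think of is that your network setup may not allow the task to reach the Secrets Manager endpoint.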
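One way to probe that hypothesis is a plain TCP check against the regional Secrets Manager endpoint. The sketch below assumes the standard `aws` partition hostname pattern `secretsmanager.<region>.amazonaws.com`; both helper names are hypothetical. It only says something about the machine it runs on, so it is meaningful only when run from a host in the same subnets and security groups as the task. A Fargate task needs a route to that endpoint through a public IP, a NAT gateway, or a VPC interface endpoint.

```python
import socket


def secrets_manager_host(secret_arn):
    """Derive the regional Secrets Manager hostname from a secret ARN."""
    # Expected layout: arn:partition:secretsmanager:region:account:secret:name
    parts = secret_arn.split(":")
    if len(parts) < 7 or parts[0] != "arn" or parts[2] != "secretsmanager":
        raise ValueError("not a Secrets Manager ARN: " + secret_arn)
    # Assumption: standard "aws" partition, so the domain is amazonaws.com.
    return "secretsmanager." + parts[3] + ".amazonaws.com"


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder ARN (hypothetical account ID and suffix), not a real secret.
    arn = "arn:aws:secretsmanager:us-east-2:111111111111:secret:DockerHubAPIToken-abc123"
    host = secrets_manager_host(arn)
    print(host, "reachable:", can_reach(host))
```

A successful connection only shows that the network path is open. It does not show that the role is allowed to read the secret.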
I was able to solve this by creating a new ECS secret with the following command:
docker ecs secret create <yourTokenName> --username <yourUserName> --password <yourPassword>
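The command then prints an AWS secret ARN that references the newly created secret. I put that ARN in the designated place in the Docker Compose file, and then the docker compose up command worked and the ECS cluster came up successfully. Really cool.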
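One plausible explanation, which is an assumption and is not confirmed anywhere in this thread, is the shape of the secret value. ECS private registry authentication expects the secret to be a JSON object with `username` and `password` keys, so a secret holding only the bare token would fail. The sketch below builds and checks such a payload; both function names are hypothetical.

```python
import json


def dockerhub_secret_string(username, token):
    """Serialize registry credentials as a JSON object with username and password keys."""
    if not username or not token:
        raise ValueError("username and token must be non-empty")
    return json.dumps({"username": username, "password": token})


def looks_like_registry_secret(secret_string):
    """Check that a secret value is a JSON object with string username and password."""
    try:
        data = json.loads(secret_string)
    except ValueError:
        # Not valid JSON at all, for example a bare access token.
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("username"), str)
        and isinstance(data.get("password"), str)
    )


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    payload = dockerhub_secret_string("my-user", "my-access-token")
    print(payload, looks_like_registry_secret(payload))
```

Running `looks_like_registry_secret` on the value of a hand-made secret is a quick way to see whether it has the same shape as one created by the command above.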