I'm running a Docker image on an ECS cluster so that I can shell into it and run some simple tests. However, when I run this:
aws ecs execute-command \
  --cluster MyEcsCluster \
  --task $ECS_TASK_ARN \
  --container MainContainer \
  --command "/bin/bash" \
  --interactive
I get this error:
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.
An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.
I can confirm that the task, the container, and the agent are all running:
aws ecs describe-tasks \
  --cluster MyEcsCluster \
  --tasks $ECS_TASK_ARN \
  | jq '.'
"containers": [
  {
    "containerArn": "<redacted>",
    "taskArn": "<redacted>",
    "name": "MainContainer",
    "image": "confluentinc/cp-kafkacat",
    "runtimeId": "<redacted>",
    "lastStatus": "RUNNING",
    "networkBindings": [],
    "networkInterfaces": [
      {
        "attachmentId": "<redacted>",
        "privateIpv4Address": "<redacted>"
      }
    ],
    "healthStatus": "UNKNOWN",
    "managedAgents": [
      {
        "lastStartedAt": "2021-09-20T16:26:44.540000-05:00",
        "name": "ExecuteCommandAgent",
        "lastStatus": "RUNNING"
      }
    ],
    "cpu": "0",
    "memory": "4096"
  }
],
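For anyone who would rather script this check than read the JSON by eye, here is a minimal sketch in TypeScript. It assumes the "containers" array from describe-tasks has already been parsed; the helper name execAgentRunning and the trimmed-down interfaces are mine, not part of any AWS SDK. Note that this check passes for the task above even though execute-command still fails, so a RUNNING agent is necessary but clearly not sufficient.

```typescript
// Only the fields this check reads from `aws ecs describe-tasks` output.
interface ManagedAgent {
  name: string;
  lastStatus: string;
}

interface Container {
  name: string;
  lastStatus: string;
  managedAgents?: ManagedAgent[];
}

// Hypothetical helper: true when the named container is RUNNING and
// reports an ExecuteCommandAgent whose lastStatus is also RUNNING.
function execAgentRunning(containers: Container[], containerName: string): boolean {
  return containers.some(
    (c) =>
      c.name === containerName &&
      c.lastStatus === "RUNNING" &&
      (c.managedAgents ?? []).some(
        (agent) => agent.name === "ExecuteCommandAgent" && agent.lastStatus === "RUNNING",
      ),
  );
}

// Trimmed copy of the describe-tasks output shown above.
const containers: Container[] = [
  {
    name: "MainContainer",
    lastStatus: "RUNNING",
    managedAgents: [{ name: "ExecuteCommandAgent", lastStatus: "RUNNING" }],
  },
];

console.log(execAgentRunning(containers, "MainContainer"));
```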
I'm defining the ECS cluster and task definition with CDK TypeScript code:
new Cluster(stack, `MyEcsCluster`, {
  vpc,
  clusterName: `MyEcsCluster`,
})

const taskDefinition = new FargateTaskDefinition(stack, `TestTaskDefinition`, {
  family: `TestTaskDefinition`,
  cpu: 512,
  memoryLimitMiB: 4096,
})

taskDefinition.addContainer("MainContainer", {
  image: ContainerImage.fromRegistry("confluentinc/cp-kafkacat"),
  command: ["tail", "-F", "/dev/null"],
  memoryLimitMiB: 4096,
  // Some internet searches suggested setting this flag. This didn't seem to help.
  readonlyRootFilesystem: false,
})
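ECS Exec Checker should be able to figure out what's wrong with your setup. Can you give it a try?
The check-ecs-exec.sh script lets you check and validate that both your CLI environment and your ECS cluster/task are ready for ECS Exec, by calling various AWS APIs on your behalf.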
Building on @clay's comment:
I was also missing the ssmmessages:* permissions.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html#ecs-exec-required-iam-permissions says that a policy such as
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
should be attached to the role used as your "task role" (not the "task execution role"), although the ssmmessages:CreateDataChannel permission on its own does cut it.
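To see which of these actions a given task-role policy is missing, something like the following can help. This is a rough sketch, not an IAM evaluator: the names missingActions and actionMatches are hypothetical, only trailing "*" wildcards are handled, and Deny statements, Resource, and conditions are ignored. It also cannot see restrictions applied outside the role itself, such as organization-level policies.

```typescript
// The four actions listed in the policy above.
const REQUIRED_ACTIONS: string[] = [
  "ssmmessages:CreateControlChannel",
  "ssmmessages:CreateDataChannel",
  "ssmmessages:OpenControlChannel",
  "ssmmessages:OpenDataChannel",
];

interface Statement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}

interface PolicyDocument {
  Version: string;
  Statement: Statement[];
}

// Simplified matcher (assumption): case-insensitive, and only a trailing
// "*" is treated as a wildcard, e.g. "ssmmessages:*" or "*".
function actionMatches(pattern: string, action: string): boolean {
  const p = pattern.toLowerCase();
  const a = action.toLowerCase();
  if (p.charAt(p.length - 1) === "*") {
    const prefix = p.slice(0, -1);
    return a.slice(0, prefix.length) === prefix;
  }
  return p === a;
}

// Returns the required actions that no Allow statement in the policy covers.
function missingActions(policy: PolicyDocument): string[] {
  const allowed: string[] = [];
  for (const statement of policy.Statement) {
    if (statement.Effect !== "Allow") continue;
    const actions =
      typeof statement.Action === "string" ? [statement.Action] : statement.Action;
    for (const action of actions) allowed.push(action);
  }
  return REQUIRED_ACTIONS.filter(
    (required) => !allowed.some((pattern) => actionMatches(pattern, required)),
  );
}

// Example: a policy that grants only CreateDataChannel.
const dataChannelOnly: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    { Effect: "Allow", Action: ["ssmmessages:CreateDataChannel"], Resource: "*" },
  ],
};

console.log(missingActions(dataChannelOnly));
```

An empty result only means the policy text mentions all four actions; whether the calls actually succeed still depends on everything this sketch ignores.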
The managed policies
arn:aws:iam::aws:policy/AmazonSSMFullAccess
arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy
arn:aws:iam::aws:policy/AWSCloud9SSMInstanceProfile
all contain the necessary permissions; AWSCloud9SSMInstanceProfile is the most minimal of them.
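I realized that my parent organization had restricted the ssmmessages permissions. Once they were whitelisted, my issue was resolved after new tasks were launched.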