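I have created an aws_ecs_task_definition with the Terraform configuration below.
I pass local.image_tag as a variable to control which of our ECR images gets deployed through Terraform.
The initial terraform plan/apply cycle brings up the ecs_cluster without problems.
However, on every subsequent terraform plan/apply cycle, Terraform forces a new container definition, so the whole task definition is redeployed even though the ECR image tag in local.image_tag stays the same.
The result is unexpected recycling of the task definition without any change to the ECR image, only because Terraform treats the default values as changes.
TF config: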
resource "aws_ecs_task_definition" "this_task" {
  family = "this-service"
  execution_role_arn = var.this_role
  task_role_arn = var.this_role
  network_mode = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu = 256
  memory = var.env != "prod" ? 512 : 1024
  tags = local.common_tags
  # Log to Datadog if it's running in the prod account.
  container_definitions = (
    <<TASK_DEFINITION
[
  {
    "essential": true,
    "image": "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}",
    "environment": [
      {"name":"ID", "value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["id"]}"},
      {"name":"SECRET","value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["secret"]}"},
      {"name":"THIS_SOCKET_URL","value":"${local.websocket_url}"},
      {"name":"THIS_PLATFORM_API","value":"${local.platform_api}"},
      {"name":"REDISURL","value":"${var.redis_url}"},
      {"name":"BASE_S3","value":"${aws_s3_bucket.ec2_vp.id}"}
    ],
    "name": "ec2-vp",
    "logConfiguration": {
      "logDriver": "awsfirelens",
      "options": {
        "Name": "datadog",
        "apikey": "${jsondecode(data.aws_secretsmanager_secret_version.datadog_api_key[0].secret_string)["api_key"]}",
        "Host": "http-intake.logs.datadoghq.com",
        "dd_service": "this",
        "dd_source": "this",
        "dd_message_key": "log",
        "dd_tags": "cluster:${var.cluster_id},Env:${var.env}",
        "TLS": "on",
        "provider": "ecs"
      }
    },
    "portMappings": [
      {
        "containerPort": 443,
        "hostPort": 443
      }
    ]
  },
  {
    "essential": true,
    "image": "amazon/aws-for-fluent-bit:latest",
    "name": "log_router",
    "firelensConfiguration": {
      "type": "fluentbit",
      "options": { "enable-ecs-log-metadata": "true" }
    }
  }
]
TASK_DEFINITION
  )
}
-/+ resource "aws_ecs_task_definition" "this_task" {
      ~ arn = "arn:aws:ecs:ca-central-1:AWS_ACCOUNT_ID:task-definition/this:4" -> (known after apply)
      ~ container_definitions = jsonencode(
          ~ [ # forces replacement
              ~ {
                  - cpu = 0 -> null
                    environment = [
                        {
                            name = "BASE_S3"
                            value = "thisisthevalue"
                        },
                        {
                            name = "THIS_PLATFORM_API"
                            value = "thisisthevlaue"
                        },
                        {
                            name = "SECRET"
                            value = "thisisthesecret"
                        },
                        {
                            name = "ID"
                            value = "thisistheid"
                        },
                        {
                            name = "THIS_SOCKET_URL"
                            value = "thisisthevalue"
                        },
                        {
                            name = "REDISURL"
                            value = "thisisthevalue"
                        },
                    ]
                    essential = true
                    image = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/this:v1.0.0-develop.6"
                    logConfiguration = {
                        logDriver = "awsfirelens"
                        options = {
                            Host = "http-intake.logs.datadoghq.com"
                            Name = "datadog"
                            TLS = "on"
                            apikey = "thisisthekey"
                            dd_message_key = "log"
                            dd_service = "this"
                            dd_source = "this"
                            dd_tags = "thisisthetags"
                            provider = "ecs"
                        }
                    }
                  - mountPoints = [] -> null
                    name = "ec2-vp"
                  ~ portMappings = [
                      ~ {
                            containerPort = 443
                            hostPort = 443
                          - protocol = "tcp" -> null
                        },
                    ]
                  - volumesFrom = [] -> null
                } # forces replacement,
              ~ {
                  - cpu = 0 -> null
                  - environment = [] -> null
                    essential = true
                    firelensConfiguration = {
                        options = {
                            enable-ecs-log-metadata = "true"
                        }
                        type = "fluentbit"
                    }
                    image = "amazon/aws-for-fluent-bit:latest"
                  - mountPoints = [] -> null
                    name = "log_router"
                  - portMappings = [] -> null
                  - user = "0" -> null
                  - volumesFrom = [] -> null
                } # forces replacement,
            ]
        )
        cpu = "256"
        execution_role_arn = "arn:aws:iam::AWS_ACCOUNTID:role/thisistherole"
        family = "this"
      ~ id = "this-service" -> (known after apply)
        memory = "512"
        network_mode = "awsvpc"
        requires_compatibilities = [
            "FARGATE",
        ]
      ~ revision = 4 -> (known after apply)
        tags = {
            "Cluster" = "this"
            "Env" = "this"
            "Name" = "this"
            "Owner" = "this"
            "Proj" = "this"
            "SuperCluster" = "this"
            "Terraform" = "true"
        }
        task_role_arn = "arn:aws:iam::AWS_ACCOUNT+ID:role/thisistherole"
    }
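Above is the Terraform plan that forces a new task definition/container definition.
As you can see, Terraform is replacing all the default values with null or empty. I double-checked the terraform.tfstate file generated by the previous run, and its values are exactly the same as those shown in the plan above.
I don't know why this unexpected behaviour happens and would like to know how to fix it.
I am using Terraform 0.12.25 and the latest Terraform AWS provider.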
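There is a known Terraform AWS provider bug for this issue.
To stop Terraform from replacing the running task/container definition, I had to fill in every default value that terraform plan shows as changing, setting each one explicitly to null or to an empty collection as appropriate.
After filling in all of those parameters, I ran the terraform plan/apply cycle again to confirm that it no longer replaces the container definition as it did before.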
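As an illustration of that approach, here is a minimal sketch for the log_router container. It assumes the defaults are exactly the ones the plan in the question marks with "-> null" (cpu, environment, mountPoints, portMappings, volumesFrom, user). The local name log_router_container is hypothetical, and jsonencode is used instead of the heredoc only to keep the sketch short.

```hcl
locals {
  # Hypothetical local: the sidecar from the question, with every default
  # that the plan reported as "-> null" written out explicitly.
  log_router_container = {
    name      = "log_router"
    image     = "amazon/aws-for-fluent-bit:latest"
    essential = true

    # Defaults taken from the plan diff in the question.
    cpu          = 0
    environment  = []
    mountPoints  = []
    portMappings = []
    volumesFrom  = []
    user         = "0"

    firelensConfiguration = {
      type    = "fluentbit"
      options = { "enable-ecs-log-metadata" = "true" }
    }
  }
}

# In the task definition this could then be passed as, for example:
#   container_definitions = jsonencode([local.log_router_container])
```

Going by the same plan, the ec2-vp container would need cpu = 0, mountPoints = [], volumesFrom = [] and protocol = "tcp" on its port mapping. The exact set of defaults can differ between provider versions, so it is worth re-running terraform plan and checking what is still reported.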
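I ran into the same problem when using aws-for-fluent-bit as a sidecar container. Adding "user": "0" to this container definition was the minimal change that stopped the task definition from being forcibly recreated.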
{
  "name": "log_router",
  "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:latest",
  "logConfiguration": null,
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": {
      "enable-ecs-log-metadata": "true"
    }
  },
  "user": "0"
}
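As mentioned in the previous answers and comments, task definitions still tend to force replacement and show unreliable diffs.
However, I managed to avoid a similar issue on v1.3.7, using AWS provider registry.terraform.io/hashicorp/aws v5.13.1, by:
- Creating a separate resource "aws_ecs_task_definition" "example" and setting task_definition = aws_ecs_task_definition.example.arn on the corresponding aws_ecs_service
- Inside aws_ecs_task_definition, using jsonencode instead of templatefile or anything like <<TASK_DEFINITION
However, this relies on latest always being the correct Docker image version, or on whatever version you set in the code. Anything other than latest needs some parameter manipulation to work.
More about jsonencode: https://developer.hashicorp.com/terraform/language/functions/jsonencode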
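The two points above could look roughly like the following sketch. The variables cluster_arn and subnet_ids, the container name "app" and the sizing values are assumptions made for illustration; they are not taken from the answer.

```hcl
# Hypothetical inputs, declared so the sketch is self-contained.
variable "cluster_arn" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

resource "aws_ecs_task_definition" "example" {
  family                   = "example"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = 512

  # jsonencode builds the JSON from HCL values instead of a heredoc or
  # templatefile.
  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "public.ecr.aws/aws-observability/aws-for-fluent-bit:latest"
      essential = true
    }
  ])
}

resource "aws_ecs_service" "example" {
  name            = "example"
  cluster         = var.cluster_arn
  desired_count   = 1
  launch_type     = "FARGATE"

  # The service points at the separately managed task definition.
  task_definition = aws_ecs_task_definition.example.arn

  network_configuration {
    subnets = var.subnet_ids
  }
}
```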
As a workaround, you can ignore any changes in container_definitions and add local.image_tag to replace_triggered_by:
resource "aws_ecs_task_definition" "this_task" {
  family = "this-service"
  container_definitions = local.container_definitions
  ...
  lifecycle {
    replace_triggered_by = [local.image_tag]
    ignore_changes = [container_definitions]
  }
}
I'm not sure whether a local value can be passed to replace_triggered_by, but you can try it.
More about lifecycle: https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#replace_triggered_by
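On the open question about local values: the lifecycle documentation describes replace_triggered_by as taking references to managed resources, their instances or their attributes, so a local value most likely cannot be listed directly. One possible way around this, assuming Terraform 1.4 or later, is to wrap the value in the built-in terraform_data resource. In the sketch below, the resource name image_tag, the tag value and the sizing values are placeholders.

```hcl
locals {
  image_tag = "v1.0.0-develop.6" # placeholder tag
}

# terraform_data is a managed resource, so it can be referenced from
# replace_triggered_by; it is replaced whenever its input changes.
resource "terraform_data" "image_tag" {
  input = local.image_tag
}

resource "aws_ecs_task_definition" "this_task" {
  family                   = "this-service"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = 512

  container_definitions = jsonencode([
    {
      name      = "ec2-vp"
      image     = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}"
      essential = true
    }
  ])

  lifecycle {
    # Replace the task definition only when the image tag changes ...
    replace_triggered_by = [terraform_data.image_tag]
    # ... and ignore the default-value noise in container_definitions.
    ignore_changes = [container_definitions]
  }
}
```

One trade-off of this design is that any other edit to container_definitions, for example a new environment variable, is also ignored until the image tag changes.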