Terraform keeps forcing a new resource / forced replacement of container definitions because of default parameters



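I brought up an aws_ecs_task_definition with the Terraform configuration below.

I pass local.image_tag as a variable to control the deployment of our ECR image through Terraform.

I was able to bring up the ecs_cluster in the initial terraform plan/apply cycle.

However, on every subsequent terraform plan/apply cycle, Terraform forces a new container definition, so the entire task definition is redeployed even though our ECR image tag, local.image_tag, stays the same.

This causes unintended task definition recycling without any change to the ECR image; the only differences are default values that Terraform wants to reset.

TF config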

resource "aws_ecs_task_definition" "this_task" {
  family                   = "this-service"
  execution_role_arn       = var.this_role
  task_role_arn            = var.this_role
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = var.env != "prod" ? 512 : 1024
  tags                     = local.common_tags
  # Log to Datadog if it's running in the prod account.
  container_definitions = (
    <<TASK_DEFINITION
[
  {
    "essential": true,
    "image": "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}",
    "environment": [
      {"name":"ID", "value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["id"]}"},
      {"name":"SECRET","value":"${jsondecode(data.aws_secretsmanager_secret_version.this_decrypt.secret_string)["secret"]}"},
      {"name":"THIS_SOCKET_URL","value":"${local.websocket_url}"},
      {"name":"THIS_PLATFORM_API","value":"${local.platform_api}"},
      {"name":"REDISURL","value":"${var.redis_url}"},
      {"name":"BASE_S3","value":"${aws_s3_bucket.ec2_vp.id}"}
    ],
    "name": "ec2-vp",
    "logConfiguration": {
      "logDriver": "awsfirelens",
      "options": {
        "Name": "datadog",
        "apikey": "${jsondecode(data.aws_secretsmanager_secret_version.datadog_api_key[0].secret_string)["api_key"]}",
        "Host": "http-intake.logs.datadoghq.com",
        "dd_service": "this",
        "dd_source": "this",
        "dd_message_key": "log",
        "dd_tags": "cluster:${var.cluster_id},Env:${var.env}",
        "TLS": "on",
        "provider": "ecs"
      }
    },
    "portMappings": [
      {
        "containerPort": 443,
        "hostPort": 443
      }
    ]
  },
  {
    "essential": true,
    "image": "amazon/aws-for-fluent-bit:latest",
    "name": "log_router",
    "firelensConfiguration": {
      "type": "fluentbit",
      "options": { "enable-ecs-log-metadata": "true" }
    }
  }
]
TASK_DEFINITION
  )
}


-/+ resource "aws_ecs_task_definition" "this_task" {
      ~ arn                      = "arn:aws:ecs:ca-central-1:AWS_ACCOUNT_ID:task-definition/this:4" -> (known after apply)
      ~ container_definitions    = jsonencode(
          ~ [ # forces replacement
              ~ {
                  - cpu              = 0 -> null
                    environment      = [
                        {
                            name  = "BASE_S3"
                            value = "thisisthevalue"
                        },
                        {
                            name  = "THIS_PLATFORM_API"
                            value = "thisisthevlaue"
                        },
                        {
                            name  = "SECRET"
                            value = "thisisthesecret"
                        },
                        {
                            name  = "ID"
                            value = "thisistheid"
                        },
                        {
                            name  = "THIS_SOCKET_URL"
                            value = "thisisthevalue"
                        },
                        {
                            name  = "REDISURL"
                            value = "thisisthevalue"
                        },
                    ]
                    essential        = true
                    image            = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/this:v1.0.0-develop.6"
                    logConfiguration = {
                        logDriver = "awsfirelens"
                        options   = {
                            Host           = "http-intake.logs.datadoghq.com"
                            Name           = "datadog"
                            TLS            = "on"
                            apikey         = "thisisthekey"
                            dd_message_key = "log"
                            dd_service     = "this"
                            dd_source      = "this"
                            dd_tags        = "thisisthetags"
                            provider       = "ecs"
                        }
                    }
                  - mountPoints      = [] -> null
                    name             = "ec2-vp"
                  ~ portMappings     = [
                      ~ {
                            containerPort = 443
                            hostPort      = 443
                          - protocol      = "tcp" -> null
                        },
                    ]
                  - volumesFrom      = [] -> null
                } # forces replacement,
              ~ {
                  - cpu                   = 0 -> null
                  - environment           = [] -> null
                    essential             = true
                    firelensConfiguration = {
                        options = {
                            enable-ecs-log-metadata = "true"
                        }
                        type    = "fluentbit"
                    }
                    image                 = "amazon/aws-for-fluent-bit:latest"
                  - mountPoints           = [] -> null
                    name                  = "log_router"
                  - portMappings          = [] -> null
                  - user                  = "0" -> null
                  - volumesFrom           = [] -> null
                } # forces replacement,
            ]
        )
        cpu                      = "256"
        execution_role_arn       = "arn:aws:iam::AWS_ACCOUNTID:role/thisistherole"
        family                   = "this"
      ~ id                       = "this-service" -> (known after apply)
        memory                   = "512"
        network_mode             = "awsvpc"
        requires_compatibilities = [
            "FARGATE",
        ]
      ~ revision                 = 4 -> (known after apply)
        tags                     = {
            "Cluster"      = "this"
            "Env"          = "this"
            "Name"         = "this"
            "Owner"        = "this"
            "Proj"         = "this"
            "SuperCluster" = "this"
            "Terraform"    = "true"
        }
        task_role_arn            = "arn:aws:iam::AWS_ACCOUNT+ID:role/thisistherole"
    }

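Above is the Terraform plan that forces a new task definition / container definition.

As you can see, Terraform is replacing all the default values with null or empty. I double-checked the terraform.tfstate file generated by the previous run, and the values there are exactly the same as the ones shown in the plan above.

I don't know why this unexpected behavior happens, and I would like to know how to fix it.

I am using Terraform 0.12.25 and the latest Terraform AWS provider.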

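There is a known Terraform AWS provider bug for this issue.

To keep Terraform from replacing the running task / container definition, I had to fill in every default that terraform plan shows, using the reported value or an empty set of configuration for each.

After filling in all the parameters, I ran the terraform plan/apply cycle again to make sure it no longer replaced the container definition as it did before.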
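A minimal sketch of what filling in the defaults can look like for the two containers in the question, assuming the defaults are exactly the ones the plan above marks with `-> null`. The local names `app_container` and `log_router_container` are hypothetical, `local.image_tag` is the value from the question, and the `environment` and `logConfiguration` entries of the first container are left out for brevity. It uses `jsonencode` rather than the heredoc from the question, which is a choice made here and not necessarily what was done originally.

```hcl
locals {
  # Application container with the defaults from the plan spelled out.
  # NOTE: environment and logConfiguration are omitted here for brevity.
  app_container = {
    name        = "ec2-vp"
    image       = "AWS_ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com/thisisservice:${local.image_tag}"
    essential   = true
    cpu         = 0
    mountPoints = []
    volumesFrom = []
    portMappings = [
      {
        containerPort = 443
        hostPort      = 443
        protocol      = "tcp" # default reported by the plan
      }
    ]
  }

  # FireLens sidecar with the defaults from the plan spelled out.
  log_router_container = {
    name         = "log_router"
    image        = "amazon/aws-for-fluent-bit:latest"
    essential    = true
    cpu          = 0
    environment  = []
    mountPoints  = []
    portMappings = []
    volumesFrom  = []
    user         = "0"
    firelensConfiguration = {
      type    = "fluentbit"
      options = { "enable-ecs-log-metadata" = "true" }
    }
  }

  # Pass this to container_definitions on the task definition.
  container_definitions = jsonencode([
    local.app_container,
    local.log_router_container,
  ])
}
```

If a later plan still shows a `-> null` line, that attribute is probably another default that needs to be added in the same way.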

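I ran into the same problem when running aws-for-fluent-bit as a sidecar container. Adding "user": "0" to this container definition was the minimal change that stopped the task definition from being forcibly recreated.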

{
  "name": "log_router",
  "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:latest",
  "logConfiguration": null,
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": {
      "enable-ecs-log-metadata": "true"
    }
  },
  "user": "0"
}

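As mentioned in the previous answers and comments, task definitions still commonly force replacement and show unreliable diffs.

However, I managed to avoid a similar issue on v1.3.7 with the AWS provider registry.terraform.io/hashicorp/aws v5.13.1 by:

  • Creating a separate resource "aws_ecs_task_definition" "example" and setting task_definition = aws_ecs_task_definition.example.arn on the corresponding aws_ecs_service
  • Inside aws_ecs_task_definition, using jsonencode instead of templatefile or anything like <<TASK_DEFINITION

However, this relies on latest always being the correct Docker image version, or on whatever version you set in the code. Anything other than latest needs some parameter manipulation to work.

More about jsonencode: https://developer.hashicorp.com/terraform/language/functions/jsonencode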
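The two points above can be sketched roughly as follows. This is an illustration under assumptions, not the exact configuration used: the variables `execution_role_arn`, `cluster_arn`, and `subnet_ids`, the family and service names, and the CPU/memory sizes are all hypothetical.

```hcl
# Hypothetical inputs for this sketch.
variable "execution_role_arn" { type = string }
variable "cluster_arn" { type = string }
variable "subnet_ids" { type = list(string) }

resource "aws_ecs_task_definition" "example" {
  family                   = "example"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = 512
  execution_role_arn       = var.execution_role_arn

  # jsonencode builds the JSON from an HCL value instead of a heredoc or template.
  container_definitions = jsonencode([
    {
      name      = "log_router"
      image     = "public.ecr.aws/aws-observability/aws-for-fluent-bit:latest"
      essential = true
      user      = "0"
      firelensConfiguration = {
        type    = "fluentbit"
        options = { "enable-ecs-log-metadata" = "true" }
      }
    }
  ])
}

resource "aws_ecs_service" "example" {
  name            = "example"
  cluster         = var.cluster_arn
  launch_type     = "FARGATE"
  desired_count   = 1

  # Point the service at the separately managed task definition.
  task_definition = aws_ecs_task_definition.example.arn

  network_configuration {
    subnets = var.subnet_ids
  }
}
```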

As a workaround, you can ignore any changes in container_definitions and add local.image_tag to replace_triggered_by:

resource "aws_ecs_task_definition" "this_task" {
  family                = "this-service"
  container_definitions = local.container_definitions
  ...
  lifecycle {
    replace_triggered_by = [local.image_tag]
    ignore_changes       = [container_definitions]
  }
}

I'm not sure whether a local value can be passed to replace_triggered_by, but you can try it.
See here for more about lifecycle: https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#replace_triggered_by
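On the open question above: replace_triggered_by only accepts references to managed resources, so a local value cannot be listed there directly. One way around that, sketched below, is to wrap the local in a terraform_data resource and reference that instead. This assumes Terraform 1.4 or later (where terraform_data is available); the resource name `image_tag` is hypothetical, and the remaining task definition arguments are the ones from the question.

```hcl
# Holds the image tag in a managed resource so it can act as a trigger.
# Requires Terraform 1.4+; the name "image_tag" is only illustrative.
resource "terraform_data" "image_tag" {
  input = local.image_tag
}

resource "aws_ecs_task_definition" "this_task" {
  family                = "this-service"
  container_definitions = local.container_definitions
  # Other arguments as in the question.

  lifecycle {
    # Replace the task definition whenever the wrapped image tag changes.
    replace_triggered_by = [terraform_data.image_tag]
    ignore_changes       = [container_definitions]
  }
}
```

It is worth checking with terraform plan that a triggered replacement actually picks up the new container_definitions, since that attribute is also listed in ignore_changes.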
