Hello,
I have a bit of a headache.
I want to create buckets and cp bulk files at the same time. I have multiple folders (one per dataset name) inside a schema folder containing JSON files: schema/dataset1, schema/dataset2, schema/dataset3.
The trick is that Terraform generates each bucket name plus a random number, to avoid names that are already taken. My question is:
How do I copy the bulk files into the buckets (while creating the buckets at the same time)?
resource "google_storage_bucket" "map" {
for_each = {for i, v in var.gcs_buckets: i => v}
name = "${each.value.id}_${random_id.suffix[0].hex}"
location = var.default_region
storage_class = "REGIONAL"
uniform_bucket_level_access = true
#If you destroy your bucket, this option will delete all objects inside this bucket
#if not Terrafom will fail that run
force_destroy = true
labels = {
env = var.env_label
}
resource "google_storage_bucket_object" "map" {
for_each = {for i, v in var.json_buckets: i => v}
name = ""
source = "schema/${each.value.dataset_name}/*"
bucket = contains([each.value.bucket_name], each.value.dataset_name)
#bucket = "${google_storage_bucket.map[contains([each.value.bucket_name], each.value.dataset_name)]}"
}
variable "json_buckets" {
type = list(object({
bucket_name = string
dataset_name = string
}))
default = [
{
bucket_name = "schema_table1",
dataset_name = "dataset1",
},
{
bucket_name = "schema_table2",
dataset_name = "dataset2",
},
{
bucket_name = "schema_table2",
dataset_name = "dataset3",
},
]
}
variable "gcs_buckets" {
type = list(object({
id = string
description = string
}))
default = [
{
id = "schema_table1",
description = "schema_table1",
},
]
}
...
Why do you have bucket = contains([each.value.bucket_name], each.value.dataset_name)? The contains function returns a bool, while bucket takes a string (the name of the bucket).

There is no resource that lets you copy multiple objects into a bucket at once. If you need to do this in Terraform, you can use the fileset function to get a list of files in a directory, then use that list in a for_each on google_storage_bucket_object. It might look something like this (untested):
locals {
// Create a master list that has all files for all buckets
all_files = merge([
// Loop through each bucket/dataset combination
for bucket_idx, bucket_data in var.json_buckets:
{
// For each bucket/dataset combination, get a list of all files in that dataset
for file in fileset("schema/${bucket_data.dataset_name}/", "**"):
// And stick it in a map of all bucket/file combinations
"bucket-${bucket_idx}-${file}" => merge(bucket_data, {
file_name = file
})
}
]...)
}
resource "google_storage_bucket_object" "map" {
for_each = local.all_files
name = each.value.file_name
source = "schema/${each.value.dataset_name}/${each.value.file_name}"
bucket = each.value.bucket_name
}
Warning: do not do this if you have lots of files to upload. This creates one resource in the Terraform state file for each uploaded file, which means that every time you run terraform plan or terraform apply, Terraform will call the API to check the state of every uploaded file. It will get very slow, very fast, if you have hundreds of files to upload.

If you are uploading a large number of files, consider using an external CLI-based tool to sync your local files with the remote bucket after the bucket is created. You can use a module like this one to run external CLI commands.
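As a minimal sketch of that CLI-sync approach (assuming the gsutil CLI is installed on the machine running Terraform, and reusing the json_buckets variable from the question), a null_resource with a local-exec provisioner could sync each dataset folder after the buckets exist. Note that the gs:// destination below is illustrative: in practice you would reference the generated bucket name (including its random suffix) from the google_storage_bucket resource rather than the plain bucket_name value.

```hcl
# Sketch only: sync each dataset folder with gsutil rsync instead of
# tracking every file as an individual google_storage_bucket_object.
resource "null_resource" "sync_schema_files" {
  for_each = { for i, v in var.json_buckets : i => v }

  # Re-run the sync if the target bucket changes
  triggers = {
    bucket = each.value.bucket_name
  }

  provisioner "local-exec" {
    # -m parallelizes the transfer; -r recurses into subdirectories
    command = "gsutil -m rsync -r schema/${each.value.dataset_name}/ gs://${each.value.bucket_name}/"
  }
}
```

Unlike the per-object approach, the uploaded files never enter Terraform state, so plan/apply stays fast; the trade-off is that Terraform cannot detect drift in the uploaded objects themselves.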