Snowflake Cloud Data Platform: can I use a variable in a stage path?



I'm using a COPY INTO statement to unload some tables to S3:

COPY INTO 's3://sandbox-staging/US/'
FROM US
storage_integration = sandbox
FILE_FORMAT = (
type = 'parquet'
)
header = true
overwrite = true;

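I have to do this migration for every state. To save some time and prevent human error, I'd like to set the table name as a variable so that I can use it in both the COPY INTO and FROM clauses. For example: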

SET loc = 'US_NY';
SET staging_path = 's3://sandbox-staging/' || $loc || '/';
COPY INTO $staging_path
FROM table($loc)
storage_integration = sandbox
FILE_FORMAT = (
type = 'parquet'
)
header = true
overwrite = true;

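The FROM clause works; it's the COPY INTO target that I can't seem to get right. In the same way that the table() function accepts a table literal, is there a literal function I can use for the staging path?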

You can try using variables with EXECUTE IMMEDIATE to generate the command dynamically: https://docs.snowflake.com/en/sql-reference/sql/execute-immediate.html

SET loc = 'US_NY';
SET staging_path =  '''s3://sandbox-staging/' || $loc || '/''' ;
SET copy_command= 
'COPY INTO ' || $staging_path ||
' FROM  '  || $loc ||
' storage_integration = sandbox
FILE_FORMAT = (
type = ''parquet''
)
header = true
overwrite = true;';
EXECUTE IMMEDIATE $copy_command;

To inspect the generated COPY command, you can run:

SELECT $copy_command;

Output:

COPY INTO 's3://sandbox-staging/US_NY/' FROM US_NY storage_integration = sandbox FILE_FORMAT = ( type = 'parquet' ) header = true overwrite = true;

Going back to the original requirement, which is to run this for every state: this is tailor-made for a SQL generator.

create or replace table TABLE_LIST(NAME string);
insert into TABLE_LIST (NAME) values ('US_NY'), ('US_CA'), ('US_NC'), ('US_FL');
select $$
COPY INTO 's3://sandbox-staging/$$ || NAME || $$/'$$ || $$
FROM $$ || NAME || $$
storage_integration = sandbox
FILE_FORMAT = (
type = 'parquet'
)
header = true
overwrite = true
$$ as SQL_COMMAND
from TABLE_LIST;

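This generates all of the SQL commands as rows in a result table. If you want to run them automatically, you can use a stored procedure. There is already one available that runs generated SQL statements: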

https://snowflake.pavlik.us/index.php/2019/08/22/executing-multiple-sql-statements-in-a-stored-procedure/

You can call it like this:

call RunBatchSQL($$ 
select 'COPY INTO ''s3://sandbox-staging/' || NAME || '/''' ||
' FROM ' || NAME ||
' storage_integration = sandbox
FILE_FORMAT = (
type = ''parquet''
)
header = true
overwrite = true'
as SQL_COMMAND
from TABLE_LIST;
$$);

Writing a stored procedure from scratch might be cleaner, but this approach lets you run any generated SQL statements.
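
For comparison, here is a minimal sketch of the "from scratch" route as a Snowflake Scripting anonymous block. It is not the linked RunBatchSQL procedure. It assumes Snowflake Scripting is available in your account, that TABLE_LIST exists as created above, and that the bucket and storage integration names match the earlier examples. The names table_cursor, stmt, and n are made up for illustration.

```sql
EXECUTE IMMEDIATE $$
DECLARE
  -- Cursor over the table names stored in TABLE_LIST (created earlier).
  table_cursor CURSOR FOR SELECT NAME FROM TABLE_LIST;
  stmt STRING;
  n INTEGER DEFAULT 0;
BEGIN
  FOR rec IN table_cursor DO
    -- Build one COPY INTO statement per table; doubled quotes ('')
    -- produce a literal single quote inside the string.
    stmt := 'COPY INTO ''s3://sandbox-staging/' || rec.NAME || '/''' ||
            ' FROM ' || rec.NAME ||
            ' storage_integration = sandbox' ||
            ' FILE_FORMAT = (type = ''parquet'')' ||
            ' header = true overwrite = true';
    EXECUTE IMMEDIATE :stmt;
    n := n + 1;
  END FOR;
  -- Return the number of COPY statements that were run.
  RETURN n;
END;
$$;
```

The outer EXECUTE IMMEDIATE $$ ... $$ wrapper is there for clients that do not accept a bare scripting block; where the client supports it, the DECLARE ... END block can probably be run directly. Because table names are concatenated into the statement text, this should only be used with a TABLE_LIST you control.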
