I want to use Apache Drill to generate a Parquet file with a very specific schema. I use CTAS to join two tables, for example:
CREATE TABLE synthetic1 AS (
SELECT e1.returneddocids AS returneddocids, e1.pathinfo AS pathinfo, c1.counters AS counters
FROM dfs.`/tmp/tier1.parquet` e1 LEFT JOIN dfs.tmp.shadow3 c1 ON TRUE LIMIT 100
);
The schema of the resulting file looks like this:
message root {
  optional group returneddocids {
    repeated group list {
      optional binary element (UTF8); // need this one as required, not optional
    }
  }
  optional binary pathinfo (UTF8);
  optional group counters {
    repeated group list {
      optional group element { // need this as required
        optional binary name (UTF8); // need this as required
        optional int32 value; // need this as required
      }
    }
  }
}
How can I adjust the CTAS query so that the elements marked above are written as required instead of optional?
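This is fairly convoluted. You can use CREATE OR REPLACE SCHEMA to apply constraints. In my case the following sort of worked (not exactly, though it may help others who run into a similar problem):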
ALTER SESSION SET `store.table.use_schema_file` = true;
ALTER SESSION SET `exec.storage.enable_v3_text_reader` = true;
CREATE OR REPLACE SCHEMA (
returneddocids STRUCT<`list` STRUCT<`element` ARRAY<VARCHAR>>> NOT NULL,
pathinfo VARCHAR NOT NULL,
counters STRUCT<`list` STRUCT<`element` ARRAY<VARCHAR>>>
) FOR TABLE synthetic1;
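To check what Drill actually stored for the table, or to remove it and start over, the schema provisioning feature has companion commands. This is a minimal sketch, assuming a Drill version that supports schema provisioning and that synthetic1 resolves in the current workspace. I have not verified how these interact with the Parquet writer.

```sql
-- Show the schema file that CREATE OR REPLACE SCHEMA stored for the table
DESCRIBE SCHEMA FOR TABLE synthetic1;

-- Remove the stored schema so it can be redefined (IF EXISTS avoids an error if none is present)
DROP SCHEMA IF EXISTS FOR TABLE synthetic1;
```

Whether the NOT NULL constraints end up as required fields in the written file is best confirmed by inspecting the output Parquet schema again after rerunning the CTAS.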