If I have a Spark schema with this structure,
root
 |-- id: long (nullable = true)
 |-- firstname: string (nullable = true)
 |-- lastname: string (nullable = true)
 |-- orders: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- id: long (nullable = true)
 |    |    |-- price: double (nullable = true)
 |    |    |-- userid: long (nullable = true)
how can I create a table with this schema? I tried this query:
CREATE TABLE iceberg.test.order (
    id BIGINT,
    firstName VARCHAR,
    lastName VARCHAR,
    orders ROW(
        id BIGINT,
        price double,
        userid BIGINT
    )
)
WITH (
    format = 'PARQUET'
)
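If you are trying to create a table to read existing Parquet files on S3, the syntax looks like the following. I adapted it to the schema you showed: orders is an array of structs.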
create external table mytable (
    id bigint,
    firstname string,
    lastname string,
    -- nested types use angle brackets in Hive-style DDL
    orders array<struct<id:bigint, price:double, userid:bigint>>
) stored as parquet
location 's3://...'
Here are more examples: https://aws.amazon.com/blogs/big-data/create-tables-in-amazon-athena-from-nested-json-and-mappings-using-jsonserde/
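If the goal is instead to keep the original CREATE TABLE and only fix the column type, the following is a minimal sketch. It assumes the query in the question runs on Trino with the Iceberg connector; the question does not name the engine, so treat that as an assumption. There, an array of structs would be written as ARRAY(ROW(...)) rather than a bare ROW(...), and because ORDER is a reserved word the table name probably needs double quotes.

```sql
-- Sketch only: assumes Trino with an Iceberg catalog named "iceberg"
-- and an existing schema named "test", as in the question.
CREATE TABLE iceberg.test."order" (
    id BIGINT,
    firstname VARCHAR,
    lastname VARCHAR,
    -- array of structs: wrap the ROW type in ARRAY(...)
    orders ARRAY(ROW(
        id BIGINT,
        price DOUBLE,
        userid BIGINT
    ))
)
WITH (
    format = 'PARQUET'
)
```

This mirrors the Spark schema, where orders is array<struct<...>> and not a single struct.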