I am trying to put a Hive table on top of a Parquet file that I created from the following JSON content:
{"user_id":"4513","providers":[{"id":"4220","name":"dbmvl","behaviors":{"b1":"gxybq","b2":"ntfmx"}},{"id":"4173","name":"dvjke","behaviors":{"b1":"sizow","b2":"knuuc"}}]}
{"user_id":"3960","providers":[{"id":"1859","name":"ponsv","behaviors":{"b1":"ahfgc","b2":"txpea"}},{"id":"103","name":"uhqqo","behaviors":{"b1":"lktyo","b2":"iTuxy"}}]}
{"user_id":"567","providers":[{"id":"9622","name":"crjju","behaviors":{"b1":"rhaqc","b2":"npnot"}},{"id":"6965","name":"fnheh","behaviors":{"b1":"eipse","b2":"nvxqk"}}]}
I am basically using Spark SQL to read the JSON and write out a Parquet file.
I am having trouble putting Hive on top of the generated Parquet file. Here is the Hive HQL I have:
create table test (mycol STRUCT<user_id:String, providers:ARRAY<STRUCT<id:String, name:String, behaviors:MAP<String, String>>>>) stored as parquet;
Alter table test set location 'hdfs:///tmp/test.parquet';
The statements above execute fine, but when I run a select * on the table I get this error:
Failed with exception java.io.IOException:java.lang.IllegalStateException: Column mycol at index 0 does not exist in {providers=providers, user_id=user_id}
Try changing the create statement to:
create table test (user_id String, providers ARRAY<STRUCT<id:String, name:String, behaviors:MAP<String, String>>>) stored as parquet;
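When the Parquet file is written, the root JSON object is flattened: its fields (user_id and providers) become the top-level columns of the file, so there is no enclosing mycol struct for Hive to find. The Hive table has to declare those fields directly as its columns.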
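
As a rough illustration of that flattening, the sketch below parses one of the sample records with plain Python (not Spark) and lists the keys of the root object. The helper name top_level_columns is hypothetical and only for illustration. The assumption is that Spark's JSON reader infers one column per root key, which matches the {providers=providers, user_id=user_id} set reported in the error message.

```python
import json

# One record copied from the sample data in the question.
record = (
    '{"user_id":"4513","providers":['
    '{"id":"4220","name":"dbmvl","behaviors":{"b1":"gxybq","b2":"ntfmx"}},'
    '{"id":"4173","name":"dvjke","behaviors":{"b1":"sizow","b2":"knuuc"}}]}'
)


def top_level_columns(json_line):
    """Return the sorted keys of the root JSON object (hypothetical helper)."""
    return sorted(json.loads(json_line).keys())


columns = top_level_columns(record)
print(columns)  # ['providers', 'user_id']
```

Neither key is named mycol, so a Hive table whose only column is mycol has nothing to map to in the file, while a table that declares user_id and providers as its own columns lines up with what was written.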