I am trying to run an ALTER TABLE command over a direct connection to a Spark Thrift Server that is configured to use a remote Hive metastore, and I get the error below.
The table was created with:
CREATE TABLE `test_schema`.`nested_test_pq` (`key1` ARRAY<STRUCT<`a`: STRING, `b`: STRING, `c`: STRING>>) USING parquet
The ALTER TABLE command:

alter table test_schema.nested_test_pq change key1 key1 array<struct<a:string,b:string,c:string,d:string>>;
The error is:
alter table test_schema.nested_test_pq change key1 key1 array<struct<a:string,b:string,c:string,d:string>>;
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: ALTER TABLE CHANGE COLUMN is not supported for changing column 'key1' with type 'ArrayType(StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true)),true)' to 'key1' with type 'ArrayType(StructType(StructField(a,StringType,true), StructField(b,StringType,true), StructField(c,StringType,true), StructField(d,StringType,true)),true)'
	at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:43)
I have already tried setting the config hive.metastore.disallow.invalid.col.type.changes to true, but with no luck.
Spark version - 3.2.1
Hive metastore version - 3.0.0
Hadoop version - 3.2.0
You could try to ADD the single nested column instead of CHANGEing the whole struct:
spark.sql(f"ALTER TABLE {db}.{table} ADD COLUMNS (key1.d string AFTER key1.c)")
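Spelled out as plain SQL, this is a sketch under the assumption that your catalog supports adding nested fields; in Spark this is a DataSource V2 capability, so it may not work against a v1 Hive/parquet table:

```sql
-- Add only the new nested field rather than redefining the whole struct
ALTER TABLE test_schema.nested_test_pq
  ADD COLUMNS (key1.d STRING AFTER key1.c);

-- Verify that the field was appended inside the struct
DESCRIBE TABLE test_schema.nested_test_pq;
```

If the command is rejected with the same AnalysisException, the remaining fallback is to recreate the table with the new schema and reload the data.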