I have a table with the following structure:
0: jdbc:hive2://vw118287.ds.dev.accenture.com> desc sample2;
Getting log thread is interrupted, since query is done!
+-------------+------------+----------+--+
| col_name | data_type | comment |
+-------------+------------+----------+--+
| event_text | string | |
| load_date | string | |
+-------------+------------+----------+--+
Now I want to add a new column with a constant value while inserting data into this table, so I ran the following query:
insert into table sample2 select event_text,'2019','Sample' as SampleColumn from ccs_service_optimization_9401.so_wireless_radius_summary ;
It throws the following error:
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:18 Cannot insert into target table because column number/types are different 'sample2': Table insclause-0 has 2 columns, but query has 3 columns.
I read in an SO post that this is not possible in Hive, but I believe there must be some way to do it. Does anyone have any ideas?
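This is not possible in Hive. Table DDL is not dynamic, and DML queries do not affect the table definition.
If you want to add a column, run an ALTER TABLE DDL statement before the insert.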
ALTER TABLE tablename ADD COLUMNS (SampleColumn string);
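Applied to the table from the question, that sequence might look like the sketch below. It assumes sample2 is not partitioned and that appending the new column after load_date is acceptable.

```sql
-- Add the new column first; ADD COLUMNS appends it after load_date.
ALTER TABLE sample2 ADD COLUMNS (SampleColumn string);

-- The select list now matches the three columns of sample2 by position.
INSERT INTO TABLE sample2
SELECT event_text, '2019', 'Sample'
FROM ccs_service_optimization_9401.so_wireless_radius_summary;
```

Rows that were already in the table should then read back NULL for SampleColumn.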
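Alternatively, if the table is external, you can drop it and recreate it with the additional column. The old data stays in place, and if the new column is the last one, it will be NULL for the old rows.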
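A minimal sketch of the drop-and-recreate route, assuming sample2 really is an external table. The LOCATION path is a placeholder, and any storage clauses (ROW FORMAT, STORED AS) have to be copied from the original DDL, which is not shown in the question.

```sql
-- Capture the current DDL before dropping, to reuse its LOCATION and storage clauses.
SHOW CREATE TABLE sample2;

-- Dropping an EXTERNAL table removes only the metastore entry; the data files stay.
DROP TABLE sample2;

-- Recreate with the extra column last so existing files still line up by position.
-- '/path/to/sample2' is a placeholder, not the real location.
CREATE EXTERNAL TABLE sample2 (
  event_text   string,
  load_date    string,
  SampleColumn string
)
LOCATION '/path/to/sample2';
```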
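From Hive version > 2.0
specifying column names in the insert statement is supported. The list has to name actual columns of the target table (col_name, data_type and comment are only the headers of the desc output), so SampleColumn must already exist in sample2:
insert into table sample2 (event_text, load_date, SampleColumn) select event_text,'2019','Sample' as SampleColumn from ccs_service_optimization_9401.so_wireless_radius_summary;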
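If your table is partitioned, column specification is not supported, and you may need to specify the partition columns during the insert.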
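For the partitioned case, an insert with a static partition could be sketched as follows. The table sample2_part and its layout are hypothetical and only illustrate the PARTITION clause; they are not part of the question.

```sql
-- Hypothetical partitioned variant of the table, used for illustration only.
CREATE TABLE sample2_part (
  event_text   string,
  SampleColumn string
)
PARTITIONED BY (load_date string);

-- Static partition: the partition value goes in the PARTITION clause,
-- so the select list supplies only the non-partition columns.
INSERT INTO TABLE sample2_part PARTITION (load_date = '2019')
SELECT event_text, 'Sample'
FROM ccs_service_optimization_9401.so_wireless_radius_summary;
```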