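I am trying to convert JSON into a DataFrame, register it as a temporary table, and run some queries against it. However, I get an org.apache.hadoop.hive.serde2.SerDeException because the JSON has more than 7 levels of nesting. I tried setting the property to true with hiveContext.sql("hive.serialization.extend.nesting.levels","true")
but I still hit the same problem. I am using Spark 1.6.1. Any help resolving this would be appreciated.
Adding the log: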
ERROR log: error in initSerDe: org.apache.hadoop.hive.serde2.SerDeException Number of levels of nesting supported for LazySimpleSerde is 7 Unable to work with level 9. Use hive.serialization.extend.nesting.levels serde property for tables using LazySimpleSerde.
org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 7 Unable to work with level 9. Use hive.serialization.extend.nesting.levels serde property for tables using LazySimpleSerde.
Thanks
If the external table is defined as follows:
create external table t1
(
a int,
b double,
c array<struct<
k1:struct<
p1:struct<
r1:struct<
h1:struct<
s1:array<struct<
j1:struct<
x1:int
>
>>
>
>
>
>
>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( "mapping.time_stamp" = "timestamp" )
LOCATION '/user/user1/staging/data/populationdata'
;
Assume the data contains more than 7 levels of nesting.
Then, in the next step, flatten the table with:
create table t1_flat -- target name must differ from the source table t1, which already exists
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ( 'hive.serialization.extend.nesting.levels'='true' )
as
select
a,
b,
c1.k1
from
t1
lateral view explode(c) subview as c1
;
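The error message describes hive.serialization.extend.nesting.levels as a SerDe property of the table, which suggests it has to be attached to the table definition rather than set for the session. As a minimal sketch, assuming a table that already exists and uses LazySimpleSerDe (the name my_nested_table is hypothetical and does not come from the original post), the same property could also be set with standard Hive DDL:

```sql
-- my_nested_table is a hypothetical, already existing table stored with LazySimpleSerDe.
-- This attaches the property named in the error message to that table's SerDe.
ALTER TABLE my_nested_table
SET SERDEPROPERTIES ('hive.serialization.extend.nesting.levels' = 'true');
```

Whether this alone is enough on a given Spark 1.6.1 setup has not been verified here; the create table statement above sets the same property at creation time instead.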