Hive: merge two maps into one column



I have a Hive table defined as

create table mySource(
col_1   map<string, string>,
col_2   map<string, string>
)

Here is what a record might look like:

col_1                col_2
{"a":1, "b":"2"}     {"c":3, "d":"4"}

My target table looks like this:

create table myTarget(
my_col   map<string, string>
)

Now I want to merge the two columns from mySource into a single map and load it into my target table. Basically, I want to write something like this:

insert into myTarget
select
some_method(col_1, col_2) as my_col
from mySource;

Is there a built-in way to do this in Hive? I tried a few things with collect_set but got a lot of errors.

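Here is a solution that uses only built-in functions: explode both maps, UNION ALL the results, collect an array of key:value strings, join the array with ',', and convert the resulting string to a map with str_to_map: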

with mytable as (--Use your table instead of this
select 
map('a','1', 'b','2') as col_1, map('c','3', 'd','4') as col_2
)
select str_to_map(concat_ws(',',collect_set(concat(key,':',val)))) as mymap
from
(
select m1.key, m1.val 
from mytable
lateral view explode(col_1) m1 as key, val
union all
select m2.key, m2.val 
from mytable
lateral view explode(col_2) m2 as key, val
)s       
;

Result:

mymap
{"a":"1","b":"2","c":"3","d":"4"}  
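Note that the query above has no GROUP BY, so collect_set aggregates over every row of the table and returns a single merged map. To merge the maps row by row and load them into myTarget, the same steps can be grouped by a row key. The following is only a sketch: mySource as shown in the question has no key column, so the `id` column used here is a hypothetical addition.

```sql
-- Sketch only: `id` is a hypothetical unique row key that is not part of
-- the mySource definition in the question.
insert into table myTarget
select str_to_map(concat_ws(',', collect_set(concat(s.key, ':', s.val)))) as my_col
from
(
  -- one output row per (id, key, value) entry of col_1
  select t.id, m1.key, m1.val
  from mySource t
  lateral view explode(t.col_1) m1 as key, val
  union all
  -- one output row per (id, key, value) entry of col_2
  select t.id, m2.key, m2.val
  from mySource t
  lateral view explode(t.col_2) m2 as key, val
) s
group by s.id;  -- one merged map per source row
```

Two limitations are worth keeping in mind. A row whose two maps are both NULL or empty produces no exploded rows and therefore does not appear in the output. Keys or values that contain ',' or ':' will be split incorrectly by str_to_map, because those are its default delimiters. It is also worth checking how repeated keys are resolved if the same key can occur in both maps.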

With the Brickhouse library this is much easier:

ADD JAR /path/to/jar/brickhouse-0.7.1.jar;
CREATE TEMPORARY FUNCTION COMBINE AS 'brickhouse.udf.collect.CombineUDF';
select combine(col_1, col_2) as mymap from mytable;
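Applied to the tables from the question, the load into myTarget might look like the sketch below. It assumes the JAR has been added and the COMBINE function registered exactly as in the statements above, within the same session.

```sql
-- Assumes COMBINE was registered via CREATE TEMPORARY FUNCTION as shown above.
insert into table myTarget
select combine(col_1, col_2) as my_col
from mySource;
```

Because the UDF is evaluated once per row, this form needs neither a row key nor a GROUP BY.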
