Hive:如何将字符串转换为数组的数组



我有一个hive列值存储为字符串

[[1,2],[3,4,8],[5,6,7,9]]

我需要找出每个内部数组的长度。我该怎么做呢?

基本上我需要一个查询,汇总每个内部数组的大小。如果这一列被存储为数组的数组,我将执行如下操作

select sum(size(innerArray)) from myTab lateral view explode (mycol) arr as innerArray;

但是现在当我尝试上面的,我得到

FAILED: UDFArgumentException explode() takes an array or a map as a parameter

因为你的初始数组不是真正的数组,它是字符串,你需要解析并分解它:

with mytable as(
select '[[1,2],[3,4,8],[5,6,7,9]]' as mycol
)
select mycol as original_string,
innerArray_str, 
--split inner array and get size
size(split(innerArray_str,',')) inner_array_size
from mytable
--explode upper array
--replace `],` (allow spaces before comma) with `,,,` and remove all `[` and `]`, split using ,,, as a delimiter 
lateral view outer explode(split(regexp_replace(regexp_replace(mycol,'\] *,',',,,'),'\[|\]',''),',,,') )e as innerArray_str 

结果:

original_string             innerarray_str  inner_array_size
[[1,2],[3,4,8],[5,6,7,9]]   1,2             2
[[1,2],[3,4,8],[5,6,7,9]]   3,4,8           3
[[1,2],[3,4,8],[5,6,7,9]]   5,6,7,9         4

现在可以添加sum()group by

最新更新