是否有任何 Hive 数组比较函数/udf 的



hive是否有任何数组比较函数/udf来查看array=array ?

例如:

Select a.xyz, b.abc from a left join b on a.C=b.D

其中CD为数组

数组是有序结构,

 concat_ws( <separator> , <array> ) 

这个函数将使用分隔符将所有数组元素连接成一个字符串。

create table arrayDemo( id bigint, list array<String> );
create table dummy( a int );
insert into table dummy values ( 1 ) ; 
insert into arraydemo select 1, array("Paperino", "Topolino") from dummy;
insert into arraydemo select 2, array("Pippo", "Pluto") from dummy;
insert into arraydemo select 1, array("Pippo", "Pluto") from dummy; 
select * from arraydemo; 
+---------------+--------------------------+--+
| arraydemo.id  |      arraydemo.list      |
+---------------+--------------------------+--+
| 1             | ["Paperino","Topolino"]  |
| 2             | ["Pippo","Pluto"]        |
| 1             | ["Pippo","Pluto"]        |
+---------------+--------------------------+--+
select * 
     from arraydemo as a1 
          inner join arraydemo as a2 
          on concat_ws( "|", a1.list ) = concat_ws("|", a2.list ); 
+--------+--------------------------+--------+--------------------------+--+
| a1.id  |         a1.list          | a2.id  |         a2.list          |
+--------+--------------------------+--------+--------------------------+--+
| 1      | ["Paperino","Topolino"]  | 1      | ["Paperino","Topolino"]  |
| 2      | ["Pippo","Pluto"]        | 2      | ["Pippo","Pluto"]        |
| 1      | ["Pippo","Pluto"]        | 2      | ["Pippo","Pluto"]        |
| 2      | ["Pippo","Pluto"]        | 1      | ["Pippo","Pluto"]        |
| 1      | ["Pippo","Pluto"]        | 1      | ["Pippo","Pluto"]        |
+--------+--------------------------+--------+--------------------------+--+

我在这个例子中以"toString()"的方式使用它。有时最好的方法是只比较数组的重要部分

 select * 
      from arraydemo as a1 
      inner join arraydemo as a2 
      on a1.list[0] = a2.list[0]; 

可以通过

比较两个数组的值

array_diff($array1,$array2);

它返回一个数组,其中包含array1中不存在于array2或array3等中的项

您可以使用hive-third-functions。它提供了一些有用的json、数组和映射函数。对于这个问题,您可以这样使用:

Select a.xyz, b.abc from a left join b on array_equals(a.C,b.D)

最新更新