在配置单元中使用子字符串和lpad/rpad替换数字



Hello Everyone,

Kinldy帮助我在Hive中实现Below逻辑。我在蜂箱里有两张桌子(表1,表2(。我需要用一些条件将表1的One列中的0替换为9,并且需要生成一个输出列,然后我将从表2列(High column(加入同一个"输出"列,并从两个表中生成输出。

table1
SCHEME_1       LOW_1
01            12340000
01            12345000
01            12300000
table2
SCHEME            HIGH
01               12349999
01               12345999
01               12399999

从表1中,我需要生成具有以下条件的新列输出。

scenario:
1. I need to check 0's from right to left and as soon as i find 0 before any digit then i need to replace all the trailing 0's by 9.
2. In output at-least 4  digit should be there before 9.
3. In Input if 4 or less digits are available in input then I need to skip some 0's and make sure that at-least 4 digits are there before 9.
4. If more than 4 digits are available before trailing 0's then only need to replace 0,No need to replace digits.
LOW_1        output
12340000     12349999
12345000     12345999
12300000     12309999

然后需要将这个"输出"列与table2 HIGH连接,然后想要生成数据。

expected output
SCHEME            LOW_1      output      HIGH
01              12340000      12349999   12349999
01              12345000      12345999   12345999

我正在使用下面的查询,并希望编写输出列逻辑,以便它可以和表2连接。

with table1 as
(
select LOW_1,SCHEME_1 from table1 where SCHEME_1='01'
)
select table2.*,
SCHEME_1,
LOW_1
from table2 inner join table1
on
(
table2.high=output
);

下面是我想在abobe查询中添加的输出列的逻辑。

with table1 as (
select LOW_1 
) select LOW_1,
lpad(concat(splitted[0], translate(splitted[1],'0','9')),8,0) as output 
from ( 
select LOW_1, split(regexp_replace(LOW_1,'(\d{*?}?)(0+)$','$1|$2'),'\|') splitted from table1 )s;

Can Someone Suggest me to Implement the same.

它现在起作用了。我更新了我的查询如下。

with table1 as
(
select LOW_1,SCHEME_1,
lpad(
concat(
split(regexp_replace('(\d{4,}?)(0+)$','$1|$2'),'\|')[0], 
translate(split(regexp_replace(LOW_1,'(\d{4,}?)(0+)$','$1|$2'),'\|')[1],'0','9')),8,0 ) 
as output
from table1 where SCHEME_1='01'
)
select table2.*,
SCHEME_1,
LOW_1
from table2 inner join table1
on
(
table2.high=output
);

相关内容

  • 没有找到相关文章