蜂巢查询语法



我是Hive的新手,请帮助语法.. below是来自表LogStash的2列(filepath,filepath,filepath(bytes((....

/bns/ghi/cod/cob_def/abc                        | 10600
/sandbox/abc/def/xyz/ade                        | 1062659
/data/def/cag/tyz/gj/ibs                        | 457869
/tmp/cdb/def/ghik/new_data/2018-08-17           | 14565
/data/abc/def/ghi/new_data                      | 56453

我能够将汇总提交到第一个级别,....

类似地,我如何提取第二级,例如:(/data/abc,/bns/ghi,/tmp/cbd(例如;如果/数据为100 GB,我需要知道内部/数据的内容,例如/data/def = 20 gb/data/efg = 20 gb ...以及类似的3级

**select substr(filepathpath, 2, instr(substr(filepath,2), '/')-2) zone, 
   sum(filesize)from logstash group by substr(filepath, 2, instr(substr(filepath,2), '/')-2)**

@user9314128;请在下面尝试查询:希望它有帮助。谢谢

select filepath
,sum(filesize) as sumfilesize
from logstash 
where length(regexp_replace(filepath,'[^/]','')) = 1
group by filepath;

第二级;将Where子句从= 1更改为= 2

使用split和concat_ws

1级

select split(filepath, '/')[0] as 1_level_path,filesize from logstash;

1级别带有文件sum。

select split(filepath, '/')[0] as 1_level_path,SUM(filesize) 
from logstash
group by split(filepath, '/')[0] ;

2级

select CONCAT_WS('/',split(filepath, '/')[0],split(filepath, '/')[1]),filesize from logstash;

3级

 select CONCAT_WS('/',CONCAT_WS('/',split(filepath, '/')[0],split(filepath, '/')[1]),split(filepath, '/')[2])),filesize from logstash

最新更新