我是Hive的新手,请帮助语法.. below是来自表LogStash的2列(filepath,filepath,filepath(bytes((....
/bns/ghi/cod/cob_def/abc | 10600
/sandbox/abc/def/xyz/ade | 1062659
/data/def/cag/tyz/gj/ibs | 457869
/tmp/cdb/def/ghik/new_data/2018-08-17 | 14565
/data/abc/def/ghi/new_data | 56453
我能够将汇总提交到第一个级别,....
类似地,我如何提取第二级,例如:(/data/abc,/bns/ghi,/tmp/cbd(例如;如果/数据为100 GB,我需要知道内部/数据的内容,例如/data/def = 20 gb/data/efg = 20 gb ...以及类似的3级
**select substr(filepathpath, 2, instr(substr(filepath,2), '/')-2) zone,
sum(filesize)from logstash group by substr(filepath, 2, instr(substr(filepath,2), '/')-2)**
@user9314128;请在下面尝试查询:希望它有帮助。谢谢
select filepath
,sum(filesize) as sumfilesize
from logstash
where length(regexp_replace(filepath,'[^/]','')) = 1
group by filepath;
第二级;将Where子句从= 1更改为= 2
使用split和concat_ws
1级
select split(filepath, '/')[0] as 1_level_path,filesize from logstash;
1级别带有文件sum。
select split(filepath, '/')[0] as 1_level_path,SUM(filesize)
from logstash
group by split(filepath, '/')[0] ;
2级
select CONCAT_WS('/',split(filepath, '/')[0],split(filepath, '/')[1]),filesize from logstash;
3级
select CONCAT_WS('/',CONCAT_WS('/',split(filepath, '/')[0],split(filepath, '/')[1]),split(filepath, '/')[2])),filesize from logstash