我试图使用Hive中的外部爆炸来获取非现有数据,但我的查询没有返回任何东西。编辑 :桌子 - 年弦,CompanyRank
year:string,topcompanies:array<struct<name:string,rank:string>>
样本数据编辑:
2015,
"topcompanies":[
{"name":"apple","rank":"1"},
{"name":"samsung","rank":"2"},
{"name":"SONY","rank":"3"},
]
2016,
"topcompanies":[
{"name":"apple","rank":"1"},
{"name":"samsung","rank":"2"},
{"name":"SONY","rank":"3"},
{"name":"LG","rank":"4"}
]
查询获取数据
select year, rank1, rank2, rank3, rank4
FROM companyrank
LATERAL VIEW outer explode(topcompanies) rank1_t as rank1_v
LATERAL VIEW outer explode(topcompanies) rank2_t as rank2_v
LATERAL VIEW outer explode(topcompanies) rank3_t as rank3_v
LATERAL VIEW outer explode(topcompanies) rank4_t as rank4_v
WHERE
(rank1_v.rank = 1 or rank1_v.rank is null)
AND (rank2_v.rank = 2 or rank2_v.rank is null)
AND (rank3_v.rank = 3 or rank3_v.rank is null)
AND (rank4_v.rank = 4 or rank4_v.rank is null)
预期输出 -
expected output when rank4 does not exists
year rank1 rank2 rank3 rank4
2016 apple samsung SONY null
如果存在rank4数据,则
year rank1 rank2 rank3 rank4
2015 apple samsung SONY LG
编辑:
我需要每年获得所有4个等级,如果任何等级都不存在,则等级应显示为null。
您问题的简单答案是"使用lateral view outer
",但是有一个更干净的解决方案。
select min (case when i.rank = 1 then i.name end) as rank1
,min (case when i.rank = 2 then i.name end) as rank2
,min (case when i.rank = 3 then i.name end) as rank3
,min (case when i.rank = 4 then i.name end) as rank4
from companyrank c
lateral view inline(topcompanies) i
;
+--------+----------+--------+--------+
| rank1 | rank2 | rank3 | rank4 |
+--------+----------+--------+--------+
| apple | samsung | SONY | NULL |
+--------+----------+--------+--------+