在Hive中,我有一个表格,如下所示:
id | tier
---------
1 | silver
2 | silver
3 | gold
4 | platinum
1 | gold
3 | platinum
等级按银、金、铂金等顺序递增。我想从表中执行一个 SELECT DISTINCT ,这样,如果一个 id 出现多次,则具有最高层的行是所选的行。
所以最终的表格应该看起来像这样
id | tier
---------
2 | silver
4 | platinum
1 | gold
3 | platinum
如何编写这样的查询?谢谢!
WITH CTE AS
(SELECT id , tier ,ROW_NUMBER() OVER (PARTITION BY tier ORDER BY id desc) from tbalename as rno)
SELECT id , tier from CTE where rno=1;
链接 ROW_NUMBER()https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
链接 CTEhttps://blogs.oracle.com/taylor22/entry/hive_0_11_may_15