"Hive"来自多个列的最大列值

  • 本文关键字:quot Hive sql hadoop hive hiveql
  • 更新时间 :
  • 英文 :


嗨:我遇到一种情况,我需要在 3 个计算字段上找到最大值并将其存储在另一个字段中,是否可以在一个 SQL 查询中执行此操作?下面是示例

SELECT Income1 ,
Income1 * 2%  as Personal_Income ,
Income2 ,
Income2 * 10% as Share_Income ,
Income3 ,
Income3 * 1%  as Job_Income , 
Max(Personal_Income, Share_Income, Job_Income ) 
From Table

我尝试的一种方法是在第一遍和第二遍中计算Personal_Income, Share_Income, Job_Income

Select 
Case when Personal_income > Share_Income and Personal_Income > Job_Income 
then Personal_income 
when Share_income > Job_Income 
then Share_income 
Else Job_income as the greatest_income

但这需要我在十亿行表上进行 2 次扫描,我怎样才能避免这种情况并一次性完成?任何帮助非常感谢。

从Hive 1.1.0开始,您可以使用greatest()函数。此查询将在单个表扫描中执行:

select Income1 ,
Personal_Income ,
Income2 ,
Share_Income ,
Income3 ,
Job_Income ,
greatest(Personal_Income, Share_Income, Job_Income ) as greatest_income
from
(
SELECT Income1 ,
Income1 * 2%  as Personal_Income ,
Income2 ,
Income2 * 10% as Share_Income ,
Income3 ,
Income3 * 1%  as Job_Income , 
From Table
)s
;

最新更新