Hive: get the previous value from column A when column B is not NULL

Here is a table, tableA:

ID   number   Estimate   Client
--   ------   --------   ------
1    3        8          A
1    NULL     10         NULL
1    5        11         A
1    NULL     19         NULL
2    NULL     20         NULL
2    2        70         A
.......

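When the number column is not NULL, I want to select the Estimate value from the previous row. For example, when number = 3, prev_estimate = NULL; when number = 5, prev_estimate = 10; and when number = 2, prev_estimate = 20.

The query below does not seem to return the correct answer in Hive. What is the right way to do this?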

select lag(Estimate, 1) OVER (partition by ID) as prev_estimate
from tableA
where number is not null

Consider a table with the following structure:

number - int
estimate - int
order_column - int

order_column is the column that defines the order of the rows in the table.

Data in the table:

number   estimate   order_column
3          8         1 
NULL       10        2
5          11        3      
NULL       19        4 
NULL       20        5
2          70        6

I used the following query and got the result you described.

SELECT *
FROM (
  SELECT number, estimate,
         lag(estimate, 1) OVER (ORDER BY order_column) AS prev_estimate
  FROM tableA
) tbl
WHERE tbl.number IS NOT NULL;

As far as I can tell, there is no reason to partition by ID here, which is why I left ID out of my table.
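If the previous row does need to come from the same ID, the same pattern should still work with a PARTITION BY added. The sketch below is only an illustration: it runs the SQL in SQLite through Python's sqlite3 module as a stand-in for Hive, and it assumes an order_column that the original tableA does not have (some ordering column is needed for "previous row" to be well defined).

```python
import sqlite3

# SQLite is used here as a stand-in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
# order_column is hypothetical; the question's tableA has no explicit ordering column.
conn.execute(
    "CREATE TABLE tableA (ID INT, number INT, Estimate INT, order_column INT)"
)
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?, ?)",
    [
        (1, 3, 8, 1),
        (1, None, 10, 2),
        (1, 5, 11, 3),
        (1, None, 19, 4),
        (2, None, 20, 5),
        (2, 2, 70, 6),
    ],
)

# lag() runs over all rows of each ID; the outer query filters afterwards.
rows = conn.execute(
    """
    SELECT ID, number, prev_estimate
    FROM (
        SELECT ID, number, order_column,
               lag(Estimate, 1) OVER (PARTITION BY ID ORDER BY order_column)
                   AS prev_estimate
        FROM tableA
    ) tbl
    WHERE number IS NOT NULL
    ORDER BY order_column
    """
).fetchall()

print(rows)  # [(1, 3, None), (1, 5, 10), (2, 2, 20)]
```

With the sample data from the question this gives the same values as the unpartitioned query, because each non-NULL number row that has a predecessor finds it within its own ID.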

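You got the wrong result because the WHERE clause in your main query first keeps only the rows where number is not NULL, and the lag function is then computed over those rows only. Instead, lag has to be computed over all rows, and only afterwards should the rows with a non-NULL number be selected.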
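To make the difference concrete, here is a small reproduction of both orderings on the six sample rows. It is a sketch that uses SQLite via Python's sqlite3 module rather than Hive itself; the point it relies on, that WHERE is applied before window functions are evaluated, is standard SQL behavior.

```python
import sqlite3

# SQLite is used here as a stand-in for Hive (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (number INT, estimate INT, order_column INT)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?)",
    [(3, 8, 1), (None, 10, 2), (5, 11, 3), (None, 19, 4), (None, 20, 5), (2, 70, 6)],
)

# Filter first: lag() only sees the rows where number is not NULL.
filter_first = conn.execute(
    """
    SELECT number, lag(estimate, 1) OVER (ORDER BY order_column) AS prev_estimate
    FROM tableA
    WHERE number IS NOT NULL
    ORDER BY order_column
    """
).fetchall()

# Lag first: lag() sees every row, and the outer query filters afterwards.
lag_first = conn.execute(
    """
    SELECT number, prev_estimate
    FROM (
        SELECT number, order_column,
               lag(estimate, 1) OVER (ORDER BY order_column) AS prev_estimate
        FROM tableA
    ) tbl
    WHERE number IS NOT NULL
    ORDER BY order_column
    """
).fetchall()

print(filter_first)  # [(3, None), (5, 8), (2, 11)]
print(lag_first)     # [(3, None), (5, 10), (2, 20)]
```

In the filter-first version, the row with number = 5 picks up 8, the estimate of the previous non-NULL row, instead of 10, the estimate of the row that actually precedes it.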
