Hive window functions: current row vs. unbounded following



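What is the difference between "unbounded preceding and current row" and "unbounded preceding and unbounded following"?

Suppose I write this query: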

from groceries
select id, revenue, day, sum(revenue)
over ( order by id rows between unbounded preceding and current row)

and then this one:

from groceries
select id, revenue, day, sum(revenue)
over ( order by id rows between unbounded preceding and unbounded following)

What will be different when the two queries are executed?

The first sum

sum(revenue)
over ( order by id rows between unbounded preceding and current row)

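is a running sum: the dataset is ordered by id, and for each row the sum is computed over the frame [first row ... current row]. The frame is therefore different for every row and grows by one row each time. For the last row, the one with the largest id, the running sum equals the total sum over all rows.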
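As a rough illustration outside Hive, the growing frame can be mimicked in Python. This is only a sketch: the `rows` list is made-up sample data, and `itertools.accumulate` stands in for the window frame; it does not reflect how Hive actually evaluates the query.

```python
from itertools import accumulate

# Hypothetical sample data: (id, revenue) pairs, deliberately unsorted.
rows = [(3, 10), (1, 10), (2, 10)]

# Equivalent of ORDER BY id inside the OVER clause.
ordered = sorted(rows, key=lambda r: r[0])

# Equivalent of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
# each row gets the sum of every revenue from the first row up to itself.
running = list(accumulate(revenue for _, revenue in ordered))

for (row_id, revenue), total in zip(ordered, running):
    print(row_id, revenue, total)
```

The value for the last row matches the plain total of all revenues, which is the point made above about the row with the largest id.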

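The second sum, over ( order by id rows between unbounded preceding and unbounded following), is the total over all rows: the frame is the same for every row, [first row ... last row]. The ordering has no effect on the result here, so the second expression is equivalent to sum(revenue) over (). As you can see, both produce the same result: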

with mydata as (
select 1 id, 10 revenue union all
select 2 id, 10 revenue union all
select 3 id, 10 revenue union all
select 4 id, 10 revenue union all
select 5 id, 10 revenue
)
select id, revenue, 
sum(revenue) over ( order by id rows between unbounded preceding and current row) sum1,
sum(revenue) over ( order by id rows between unbounded preceding and unbounded following) sum2,
sum(revenue) over() sum3 --this is the same as sum2
from mydata
order by id

Result:

id  revenue sum1    sum2    sum3
1   10      10      50      50
2   10      20      50      50
3   10      30      50      50
4   10      40      50      50
5   10      50      50      50
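The sum1 and sum2 columns can be reproduced with a small Python sketch of a ROWS frame. The helper `window_sum` and its parameters are hypothetical names invented for this illustration (they are not a Hive or library API), and `None` is used here to mean "unbounded".

```python
def window_sum(values, preceding, following):
    """Sum each row's ROWS frame; None means unbounded on that side.

    preceding/following are counts of rows before/after the current row.
    Hypothetical helper for illustration only.
    """
    n = len(values)
    result = []
    for i in range(n):
        lo = 0 if preceding is None else max(0, i - preceding)
        hi = n - 1 if following is None else min(n - 1, i + following)
        result.append(sum(values[lo:hi + 1]))
    return result

# Revenue values already ordered by id 1..5, as in the example data.
revenue = [10, 10, 10, 10, 10]

# unbounded preceding .. current row (running sum)
sum1 = window_sum(revenue, None, 0)
# unbounded preceding .. unbounded following (grand total on every row)
sum2 = window_sum(revenue, None, None)

# Reversing the row order changes nothing for the fully unbounded frame.
sum2_reversed = window_sum(list(reversed(revenue)), None, None)

print(sum1)
print(sum2)
```

With both sides unbounded, every row sees the whole dataset, which is why the order by clause makes no difference to that sum.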
