优化presto中的窗口查询



我有一个包含user_id、col1、col2、col3、updated_at、is_deleted、day等字段的表。

当前查询如下所示-

SELECT DISTINCT
user_id,
first_value(col1) ignore nulls OVER (partition BY user_id 
ORDER BY
updated_at DESC rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following) AS col1,
first_value(col2) ignore nulls OVER (partition BY user_id 
ORDER BY
updated_at DESC rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following) AS col2,
first_value(col3) ignore nulls OVER (partition BY user_id 
ORDER BY
updated_at DESC rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following) AS col3,
bool_or(is_deleted) ignore nulls OVER (partition BY user_id 
ORDER BY
updated_at DESC rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following) AS is_deleted 
FROM
my_table
WHERE
day >= '2021-05-25'

基本上,我想要每个用户id的每个列的最新(第一个(值。由于每个值列都可以为null,我不得不多次(为每个列(运行相同的窗口查询。目前,66%的时间都花在了开窗上。有什么优化的方法吗?

看起来你想要这个:

select * from (
select * , row_number() over (partition by user_id ORDER BY updated_at DESC) rn 
from my_table
where day >= '2021-05-25'
) t 
where rn = 1

相关内容

  • 没有找到相关文章

最新更新