在 Hive SQL 中选择上一个值



我已经戳了一段时间了,但运气不好。我有一张如下所示的表格。我正在尝试在每个用户的第一个文本值之前获取 col 值。我在蜂巢上。

user    ts            col      isnumber
1       1473811200    5         y
1       1473811205    10        y
1       1473811207    15        y
1       1473811212    text1     n
1       1473811215    text2     n
1       1473811225    30        y
2       1473811201    10        y
2       1473811205    text3     n
2       1473811207    20        y
2       1473811210    30        y

输出应为:

user    col
1       15
2       10

使用窗口函数:

SELECT user_, prev
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY user_ ORDER BY ts) AS rn
FROM (SELECT *, CASE 
WHEN isnumber = 'y' THEN NULL
WHEN LAG(isnumber,1) OVER(PARTITION BY user_ ORDER BY ts) = 'y'
THEN LAG(col,1) OVER(PARTITION BY user_ ORDER BY ts)
END AS prev
FROM tab) sub
WHERE prev IS NOT NULL) sub2
WHERE rn = 1;

DBFiddle 演示

最新更新