Here is the problem: I have a staging table:
key0 key1 timestamp partition_key
5 5 2020-03-03 14:42:21.548 1
5 4 2020-03-03 14:40:11.871 1
4 3 2020-03-03 14:43:47.602 2
And this target table:
key0 key1 timestamp partition_key
5 4 2020-03-03 13:43:16.695 1
5 5 2020-03-03 13:45:24.793 1
5 2 2020-03-03 13:47:30.668 1
5 1 2020-03-03 13:48:30.669 1
4 3 2020-03-03 13:53:47.602 2
43 3 2020-03-03 14:00:14.016 2
I would like to get this output:
key0 key1 timestamp partition_key
5 5 2020-03-03 14:42:21.548 1
5 4 2020-03-03 14:40:11.871 1
5 2 2020-03-03 13:47:30.668 1
5 1 2020-03-03 13:48:30.669 1
4 3 2020-03-03 14:43:47.602 2
43 3 2020-03-03 14:00:14.016 2
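For each combination of key0, key1 and partition_key, I want the record with the most recent timestamp. In addition, I want to keep the records that already exist in the target table but do not exist in the staging table.
I first tried this query: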
select
t1.key0,
t1.key1,
t1.timestamp,
t2.partition_key
from staging_table t2
left outer join target_table t1 on
t1.key0=t2.key0 AND
t1.key1=t2.key1 AND
t1.timestamp=t2.timestamp;
This looks like a prioritization query: take everything from staging, then take the unmatched rows from target. I recommend union all:
select s.*
from staging s
union all
select t.*
from target t left join
staging s
on t.key0 = s.key0 and t.key1 = s.key1
where s.key0 is null;
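To check the union all query against the sample data from the question, here is a minimal sketch that runs it with Python's built-in sqlite3 module. The use of an in-memory SQLite database and the column types are assumptions made for illustration only; the question does not say which database is in use.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table staging (key0 int, key1 int, timestamp text, partition_key int);
    create table target  (key0 int, key1 int, timestamp text, partition_key int);
""")
# Sample rows copied from the question.
conn.executemany("insert into staging values (?, ?, ?, ?)", [
    (5, 5, "2020-03-03 14:42:21.548", 1),
    (5, 4, "2020-03-03 14:40:11.871", 1),
    (4, 3, "2020-03-03 14:43:47.602", 2),
])
conn.executemany("insert into target values (?, ?, ?, ?)", [
    (5, 4, "2020-03-03 13:43:16.695", 1),
    (5, 5, "2020-03-03 13:45:24.793", 1),
    (5, 2, "2020-03-03 13:47:30.668", 1),
    (5, 1, "2020-03-03 13:48:30.669", 1),
    (4, 3, "2020-03-03 13:53:47.602", 2),
    (43, 3, "2020-03-03 14:00:14.016", 2),
])

# All staging rows, plus target rows that have no staging row
# with the same (key0, key1).
query = """
    select s.* from staging s
    union all
    select t.*
    from target t left join staging s
      on t.key0 = s.key0 and t.key1 = s.key1
    where s.key0 is null
"""
rows = sorted(conn.execute(query).fetchall())
for row in rows:
    print(row)
```

On this data the result has six rows: the three staging rows and the three target rows whose keys do not appear in staging.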
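The union all query does assume that staging has the most recent rows, which is true in your sample data. If that is not the case, I would phrase it as: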
select key0, key1, timestamp, partition_key
from (select st.*,
row_number() over (partition by key0, key1 order by timestamp desc) as seqnum
from ((select s.* from staging s
) union all
(select t.* from target t
)
) st
) st
where seqnum = 1;
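The row_number() variant can be tried the same way. The sketch below again assumes an in-memory SQLite database (version 3.25 or later, which is needed for window functions) and stores the timestamps as ISO-formatted text, so that sorting them as strings matches chronological order. The union is written without the extra parentheses because SQLite does not accept parenthesized selects inside a compound query.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table staging (key0 int, key1 int, timestamp text, partition_key int);
    create table target  (key0 int, key1 int, timestamp text, partition_key int);
    insert into staging values
        (5, 5, '2020-03-03 14:42:21.548', 1),
        (5, 4, '2020-03-03 14:40:11.871', 1),
        (4, 3, '2020-03-03 14:43:47.602', 2);
    insert into target values
        (5, 4, '2020-03-03 13:43:16.695', 1),
        (5, 5, '2020-03-03 13:45:24.793', 1),
        (5, 2, '2020-03-03 13:47:30.668', 1),
        (5, 1, '2020-03-03 13:48:30.669', 1),
        (4, 3, '2020-03-03 13:53:47.602', 2),
        (43, 3, '2020-03-03 14:00:14.016', 2);
""")

# Stack both tables, number the rows of each (key0, key1) group from
# newest to oldest, and keep only the newest row of each group.
query = """
    select key0, key1, timestamp, partition_key
    from (select st.*,
                 row_number() over (partition by key0, key1
                                    order by timestamp desc) as seqnum
          from (select * from staging
                union all
                select * from target) st
         ) st
    where seqnum = 1
"""
latest = sorted(conn.execute(query).fetchall())
for row in latest:
    print(row)
```

Because the newest row wins regardless of which table it came from, this version does not depend on staging always being more recent than target.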
You need a FULL JOIN:
select coalesce(t1.key0, t2.key0) as key0,
coalesce(t1.key1, t2.key1) as key1,
coalesce(t1.timestamp, t2.timestamp) as timestamp,
coalesce(t1.partition_key, t2.partition_key) as partition_key
from staging_table t2 FULL JOIN
target_table t1
on t1.key0 = t2.key0 AND t1.key1 = t2.key1 AND
t1.timestamp = t2.timestamp;
I think you just need a left join and coalesce():
select
t.key0,
t.key1,
coalesce(s.timestamp, t.timestamp) timestamp,
t.partition_key
from target_table t
left join staging_table s
on s.key0 = t.key0
and s.key1 = t.key1
and s.partition_key = t.partition_key
For each record in target_table, this searches staging_table for a record with the same (key0, key1, partition_key). If such a record is available, we use its timestamp instead of the timestamp in target_table.
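As a quick check of the left join approach, the sketch below runs it on the sample data from the question and compares the result with the desired output. It uses Python's built-in sqlite3 module with an in-memory database, which is an assumption made for illustration; the join condition uses the alias t throughout.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table staging_table (key0 int, key1 int, timestamp text, partition_key int);
    create table target_table  (key0 int, key1 int, timestamp text, partition_key int);
    insert into staging_table values
        (5, 5, '2020-03-03 14:42:21.548', 1),
        (5, 4, '2020-03-03 14:40:11.871', 1),
        (4, 3, '2020-03-03 14:43:47.602', 2);
    insert into target_table values
        (5, 4, '2020-03-03 13:43:16.695', 1),
        (5, 5, '2020-03-03 13:45:24.793', 1),
        (5, 2, '2020-03-03 13:47:30.668', 1),
        (5, 1, '2020-03-03 13:48:30.669', 1),
        (4, 3, '2020-03-03 13:53:47.602', 2),
        (43, 3, '2020-03-03 14:00:14.016', 2);
""")

# Keep every target row; take the staging timestamp when a staging row
# with the same (key0, key1, partition_key) exists.
query = """
    select t.key0, t.key1,
           coalesce(s.timestamp, t.timestamp) as timestamp,
           t.partition_key
    from target_table t
    left join staging_table s
      on s.key0 = t.key0
     and s.key1 = t.key1
     and s.partition_key = t.partition_key
"""
merged = sorted(conn.execute(query).fetchall())

# Desired output copied from the question.
expected = sorted([
    (5, 5, "2020-03-03 14:42:21.548", 1),
    (5, 4, "2020-03-03 14:40:11.871", 1),
    (5, 2, "2020-03-03 13:47:30.668", 1),
    (5, 1, "2020-03-03 13:48:30.669", 1),
    (4, 3, "2020-03-03 14:43:47.602", 2),
    (43, 3, "2020-03-03 14:00:14.016", 2),
])
print(merged == expected)
```

Since the query starts from target_table, a key that exists only in staging_table would not appear in the result. That does not happen in the sample data, where every staging key is already present in the target.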