当该标识符不再存在于数据库SQL中时,缺少特定标识符的日期而没有添加额外的日期



用语言来表达这个问题,我有一个包含订阅者和每天数据的大表。如果订阅者不再存在,那么他们将没有更多的记录,即SUB123从2021年10月28日起不再存在,那么该订阅者将每天都有记录,直到2021年10月27日。手头的问题是,一些订阅者没有日期,这可能是由于周末或其他问题。我想用空值填充这些记录,这样它们就可以记录了。

当前问题:

<表类> 用户 日期 牧师 tbody><<tr>sub12325/10/2021256sub45625/10/2021282sub12326/10/2021652sub12327/10/2021396sub45628/10/2021132sub45629/10/2021484sub45601/11/202196sub45602/11/202145

您可以使用lag函数生成缺失范围以与unnest平整化,并额外处理Rev:

-- sample data
WITH dataset (Subscriber, Date, Rev) AS (
VALUES ('sub123',   date_parse('25-10-2021', '%d-%m-%Y'),   256),
('sub456',  date_parse('25-10-2021', '%d-%m-%Y'),   282),
('sub123',  date_parse('26-10-2021', '%d-%m-%Y'),   652),
('sub123',  date_parse('27-10-2021', '%d-%m-%Y'),   396),
('sub456',  date_parse('28-10-2021', '%d-%m-%Y'),   132),
('sub456',  date_parse('29-10-2021', '%d-%m-%Y'),   484),
('sub456',  date_parse('01-11-2021', '%d-%m-%Y'),   96),
('sub456',  date_parse('02-11-2021', '%d-%m-%Y'),   45)
) 
-- query
select subscriber, lifted_date as date, if(date = lifted_date, rev, NULL) rev
from
(
select Subscriber, 
Rev,
cast(date as date) date, 
lag(cast(date as date)) over(partition by Subscriber order by date) prev_date
from dataset
)
cross join unnest(
array_except(sequence(coalesce(prev_date, date), date, interval '1' day), array[prev_date])
) as t(lifted_date)
order by subscriber, date

输出:

<表类>用户日期牧师tbody><<tr>sub1232021-10-25 00:00:00.000256sub1232021-10-26 00:00:00.000652sub1232021-10-27 00:00:00.000396sub4562021-10-25 00:00:00.000282sub4562021-10-26 00:00:00.000sub4562021-10-27 00:00:00.000sub4562021-10-28 00:00:00.000132sub4562021-10-29 00:00:00.000484sub4562021-10-30 00:00:00.000sub4562021-10-31 00:00:00.000sub4562021-11-01 00:00:00.00096sub4562021-11-02 00:00:00.00045

最新更新