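I have pricing records with overlapping date ranges. On a few dates, more than one price applies. Consider the following example:
For instance, on 2022-02-15 there are two prices, 10 and 8.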
article | price | startdate | enddate |
---|---|---|---|
123 | 10 | 2022-02-02 | 2049-12-31 |
123 | 8 | 2022-02-14 | 2022-09-14 |
123 | 5 | 2022-03-14 | 2022-04-06 |
123 | 4 | 2022-04-11 | 2022-04-27 |
The rule is that the price window with the most recent start date wins the overlap.
Distinct-dates version:
with data(article, price, startdate, enddate) as (
    select * from values
        (123, 10, '2022-02-02'::date, '2049-12-31'::date),
        (123, 8, '2022-02-14'::date, '2022-09-14'::date),
        (123, 5, '2022-03-14'::date, '2022-04-06'::date),
        (123, 4, '2022-04-11'::date, '2022-04-27'::date)
), dis_times as (
    -- every start date and every (end date + 1) opens a new non-overlapping segment
    select article,
        date as startdate,
        lead(date) over (partition by article order by date) - 1 as enddate
    from (
        select distinct article, startdate as date from data
        union
        select distinct article, enddate + 1 as date from data
    )
    qualify enddate is not null
)
select
    d1.article,
    d1.price,
    d2.startdate as s_startdate,
    d2.enddate as s_enddate
from data as d1
join dis_times as d2
    on d1.article = d2.article
    and d2.startdate between d1.startdate and d1.enddate
-- among the ranges covering a segment, keep the one with the latest start date
qualify row_number() over (partition by d1.article, d2.startdate order by d1.startdate desc) = 1
order by 1, 3;
which gives:
ARTICLE | PRICE | S_STARTDATE | S_ENDDATE |
---|---|---|---|
123 | 10 | 2022-02-02 | 2022-02-13 |
123 | 8 | 2022-02-14 | 2022-03-13 |
123 | 5 | 2022-03-14 | 2022-04-06 |
123 | 8 | 2022-04-07 | 2022-04-10 |
123 | 4 | 2022-04-11 | 2022-04-27 |
123 | 8 | 2022-04-28 | 2022-09-14 |
123 | 10 | 2022-09-15 | 2049-12-31 |
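As a cross-check outside the database, the same segment logic can be sketched in plain Python. This is only an illustration under assumptions: the function name `resolve_overlaps` and the tuple layout are made up for this sketch, and it is not a replacement for the SQL. It cuts the timeline at every start date and every end date plus one day, then lets the covering range with the latest start date win each segment.

```python
from datetime import date, timedelta

# Sample rows from the question: (article, price, startdate, enddate)
rows = [
    (123, 10, date(2022, 2, 2), date(2049, 12, 31)),
    (123, 8, date(2022, 2, 14), date(2022, 9, 14)),
    (123, 5, date(2022, 3, 14), date(2022, 4, 6)),
    (123, 4, date(2022, 4, 11), date(2022, 4, 27)),
]


def resolve_overlaps(rows):
    """Hypothetical helper: split into non-overlapping segments, latest start wins."""
    one_day = timedelta(days=1)
    result = []
    for article in sorted({r[0] for r in rows}):
        mine = [r for r in rows if r[0] == article]
        # Every start date and every (end date + 1) begins a new segment.
        cuts = sorted({r[2] for r in mine} | {r[3] + one_day for r in mine})
        for seg_start, next_cut in zip(cuts, cuts[1:]):
            # Ranges that cover the first day of this segment.
            covering = [r for r in mine if r[2] <= seg_start <= r[3]]
            if not covering:
                continue  # no price applies in this segment
            winner = max(covering, key=lambda r: r[2])  # latest start date
            result.append((article, winner[1], seg_start, next_cut - one_day))
    return result


for row in resolve_overlaps(rows):
    print(row)
```

On the sample data this yields seven segments with the prices 10, 8, 5, 8, 4, 8, 10, in line with the table above.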
The first thing I did was convert your date-range price data into a per-date price lookup table.
create or replace temporary table price_date_lookup as
select distinct
article,
dateadd('day',b.index-1,start_date) as dates,
first_value(price) over (partition by article, dates order by end_date) as price
from my_table,
lateral split_to_table(repeat('.',datediff(day,start_date,end_date)), '.') b;
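Notes:
first_value handles the overlap by letting the row with the earliest end date supply the price for each date.
lateral ... generates one row per day of each range, which gives a date column containing every date in that range.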
Once that table was created, I figured the rest could be handled like a gaps and islands problem.
with cte1 as
(select *, case when lag(price) over (partition by article order by dates)=price then 0 else 1 end as price_start --flag start of a new price island
from price_date_lookup),
cte2 as
(select *, sum(price_start) over (partition by article order by dates) as price_id --assign id to all the price islands
from cte1)
select article,
price,
min(dates) as start_date,
max(dates) as end_date
from cte2
group by article,price,price_id;
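For readers who want to trace the two steps without a database, here is a rough Python sketch of the same idea: expand each range to one entry per day, then collapse runs of equal prices into islands. The names `daily_prices` and `collapse_islands` are hypothetical, and the tie-break (earliest end date wins) is my reading of the `order by end_date` in the lookup query. On the sample data this picks the same winner as "latest start date wins" because the ranges nest; that may not hold for ranges that only partially overlap.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question: (article, price, start_date, end_date)
rows = [
    (123, 10, date(2022, 2, 2), date(2049, 12, 31)),
    (123, 8, date(2022, 2, 14), date(2022, 9, 14)),
    (123, 5, date(2022, 3, 14), date(2022, 4, 6)),
    (123, 4, date(2022, 4, 11), date(2022, 4, 27)),
]


def daily_prices(rows):
    """Map (article, day) -> price; the range with the earliest end date wins."""
    best = {}  # (article, day) -> (end_date, price)
    for article, price, start, end in rows:
        for offset in range((end - start).days + 1):
            key = (article, start + timedelta(days=offset))
            if key not in best or end < best[key][0]:
                best[key] = (end, price)
    return {key: price for key, (_, price) in best.items()}


def collapse_islands(lookup):
    """Merge consecutive lookup entries with the same price into one range.

    Like the lag-based SQL, this compares neighbouring rows only, so a run
    is not split when days are missing between two rows of equal price.
    """
    ordered = sorted(lookup.items())  # sorted by article, then day
    result = []
    for (article, price), run in groupby(ordered, key=lambda kv: (kv[0][0], kv[1])):
        days = [kv[0][1] for kv in run]
        result.append((article, price, days[0], days[-1]))
    return result


for row in collapse_islands(daily_prices(rows)):
    print(row)
```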