我想取消对日期列的最小值和最大值内所有日期的间隔的嵌套,而不按任何其他列进行分组。
根据这篇文章中的答案,我可以接近,但我想生成所有job_name
值的总日期范围,而不是按每个job_name
分组的日期范围。因此,对于2021-08-20到2021-08-22,每个job_name
应该被分解为具有3行。
WITH
dataset AS (
SELECT *
FROM
( VALUES
('A', DATE '2021-08-21'), ('A', DATE '2021-08-22'),
('B', DATE '2021-08-20'), ('B', DATE '2021-08-21')
) AS d (job_name, run_date)),
nested_dates AS (
select job_name, sequence(min(run_date), max(run_date), interval '1' day) seq
from dataset
group by job_name)
SELECT job_name, dates
FROM nested_dates
CROSS JOIN UNNEST(seq) AS t(dates)
电流输出:
# job_name dates
1 A 2021-08-21 00:00:00.000
2 A 2021-08-22 00:00:00.000
3 B 2021-08-20 00:00:00.000
4 B 2021-08-21 00:00:00.000
期望输出:
# job_name dates
1 A 2021-08-20 00:00:00.000
2 A 2021-08-21 00:00:00.000
3 A 2021-08-22 00:00:00.000
3 B 2021-08-20 00:00:00.000
4 B 2021-08-21 00:00:00.000
5 B 2021-08-22 00:00:00.000
一种方法可以使用windows函数和不同的选择:
-- sample data
WITH dataset(job_name, run_date) AS (
VALUES ('A', DATE '2021-08-21'),
('A', DATE '2021-08-22'),
('B', DATE '2021-08-20'),
('B', DATE '2021-08-21')),
nested_dates AS (
select distinct job_name, max (run_date) over() max_run_date, min (run_date) over() min_run_date
from dataset)
-- query
select job_name, dates
from nested_dates,
unnest (sequence(min_run_date, max_run_date, interval '1' day)) AS t(dates)
order by job_name, dates;
输出:
job_name | 日期 |
---|---|
A | 2021-08-20 |
A | 2021-08-21 |
A | 2021-08-22 |
B | 2021-08-20 |
B | 2021-08-21 |
B | 2021-08-22 |