Running a SQL query over ten-minute time intervals



I have a PostgreSQL table with this schema:

id SERIAL PRIMARY KEY,
traveltime INT,
departuredate TIMESTAMPTZ,
departurehour TIMETZ

Here is some of the data (edited):

  id  | traveltime |     departuredate      | departurehour
------+------------+------------------------+---------------
    1 |         73 | 2019-12-24 00:00:03+01 | 00:00:03+01
    2 |         73 | 2019-12-24 00:12:16+01 | 00:12:16+01
   53 |        115 | 2019-12-24 07:53:44+01 | 07:53:44+01
   54 |        116 | 2019-12-24 07:58:45+01 | 07:58:45+01
   55 |        119 | 2019-12-24 08:03:46+01 | 08:03:46+01
   56 |        120 | 2019-12-24 08:08:47+01 | 08:08:47+01
   57 |        121 | 2019-12-24 08:13:48+01 | 08:13:48+01
   58 |        121 | 2019-12-24 08:18:48+01 | 08:18:48+01
  542 |        112 | 2019-12-26 07:52:41+01 | 07:52:41+01
  543 |        114 | 2019-12-26 07:57:42+01 | 07:57:42+01
  544 |        116 | 2019-12-26 08:02:43+01 | 08:02:43+01
  545 |        116 | 2019-12-26 08:07:44+01 | 08:07:44+01
  546 |        117 | 2019-12-26 08:12:45+01 | 08:12:45+01
  547 |        118 | 2019-12-26 08:17:46+01 | 08:17:46+01
  548 |        118 | 2019-12-26 08:22:48+01 | 08:22:48+01
 1031 |         80 | 2019-12-28 07:50:33+01 | 07:50:33+01
 1032 |         81 | 2019-12-28 07:55:34+01 | 07:55:34+01
 1033 |         81 | 2019-12-28 08:00:35+01 | 08:00:35+01
 1034 |         82 | 2019-12-28 08:05:36+01 | 08:05:36+01
 1035 |         82 | 2019-12-28 08:10:37+01 | 08:10:37+01
 1036 |         83 | 2019-12-28 08:15:38+01 | 08:15:38+01
 1037 |         83 | 2019-12-28 08:20:39+01 | 08:20:39+01

I want to get, for each 10-minute interval of the day, the average of all the traveltime values collected over several weeks.

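Expected result for the data sample: for the 10-minute interval between 8h00 and 8h10, the rows included in the avg are those with id 55, 56, 544, 545, 1033 and 1034, and so on for the other intervals.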

I can get the average for one specific interval:

select avg(traveltime) from belt where departurehour >= '10:40:00+01' and departurehour < '10:50:00+01';

To avoid writing one query per interval, I used this query to get all the 10-minute intervals for the complete recorded period:

select i from generate_series('2019-11-23', '2020-01-18', '10 minutes'::interval) i;

What I am missing is a way to apply my AVG query to each of these generated intervals. Any direction would be helpful!

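It turns out that generate_series over the date range does not really apply here, because the intervals you want do not depend on the date range. The key part is the 144 10-minute intervals in each day. Unfortunately, Postgres does not provide an interval type for minutes. (Perhaps creating one would be a useful exercise.) But all is not lost: you can simulate the same thing with BETWEEN, you just need to play with the end of the range.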
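As a quick check of that arithmetic, here is a minimal sketch (not part of the original answer) that computes the number of slots per day and shows which slot one of the sample departure times falls into. The column aliases slots_per_day, slot_no and slot_start are names chosen for this illustration only.

```sql
-- Illustration only: 600 seconds = one 10-minute slot.
-- extract(epoch from time ...) gives the number of seconds since midnight.
select 6 * 24                                                as slots_per_day,  -- 144
       floor(extract(epoch from time '08:03:46') / 600)::int as slot_no,        -- 48 (zero-based)
       time '00:00'
         + floor(extract(epoch from time '08:03:46') / 600)::int
           * interval '10 minutes'                           as slot_start;     -- 08:00:00
```

The time 08:03:46 belongs to row 55 of the sample, which is one of the rows expected in the 8h00 to 8h10 average.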
The query below uses a recursive CTE to generate these simulated intervals, then joins them to your table as before.

set timezone to '+1';    -- necessary to keep my local offset from affecting the results
-- create the table and insert the data here
-- (additional rows were added outside of the date range, so they should not be included)
with recursive min_intervals as
  (select '00:00:00'::timetz        start_10Min   -- start of the 1st 10-minute interval
        , '00:09:59.999999'::timetz end_10Min     -- last microsecond in that 10-minute interval
        , 1 interval_no
   union all
   select start_10Min + interval '10 min'
        , end_10Min   + interval '10 min'
        , interval_no + 1
   from min_intervals
   where interval_no < 144   -- 6 10-minute intervals/hr * 24 hr/day = number of 10-minute intervals in any day
  )  -- select * from min_intervals;
select start_10Min, end_10Min, avg(traveltime) average_travel_time
from min_intervals
join belt
  on departuredate::time between start_10Min and end_10Min
where departuredate::date between date '2019-11-23' and date '2020-01-18'
group by start_10Min, end_10Min
order by start_10Min;
-- test result for the 'specified' interval. Note: the added rows fall within the 08:00 to 08:10 time frame,
-- but they should be excluded, so the avg for that period should be the same for both queries.
select avg(traveltime) from belt where id in (55, 56, 544, 545, 1033, 1034);

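My problem with the above is that the date range is essentially hard-coded (yes, substitution parameters are available) and has to be changed by hand. That is fine in psql or an IDE, but not good for a production environment. If this is to be used in such an environment, I would use the following function, which returns a virtual table with the same results.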

create or replace function travel_average_per_10Min_interval(
    start_date_in date
  , end_date_in   date
)
returns table ( start_10Min     timetz
              , end_10Min       timetz
              , avg_travel_time numeric
              )
language sql
as $$
    with recursive min_intervals as
      (select '00:00:00'::timetz        start_10Min   -- start of the 1st 10-minute interval
            , '00:09:59.999999'::timetz end_10Min     -- last microsecond in that 10-minute interval
            , 1 interval_no
       union all
       select start_10Min + interval '10 min'
            , end_10Min   + interval '10 min'
            , interval_no + 1
       from min_intervals
       where interval_no < 144   -- 6 10-minute intervals/hr * 24 hr/day = number of 10-minute intervals in any day
      )  -- select * from min_intervals;
    select start_10Min, end_10Min, avg(traveltime) average_travel_time
    from min_intervals
    join belt
      on departuredate::time between start_10Min and end_10Min
    where departuredate::date between start_date_in and end_date_in
    group by start_10Min, end_10Min
    order by start_10Min;
$$;
-- test
select * from travel_average_per_10Min_interval(date '2019-11-23', date '2020-01-18');
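For comparison, here is a sketch of a different technique, not the approach used above: instead of generating the intervals and joining, each row's departure time is rounded down to the start of its 10-minute slot with integer arithmetic, and the rows are grouped on that value. It assumes the same belt table and date range as the earlier query, and the alias start_10min is chosen only to mirror it. Treat it as an untested starting point rather than a drop-in replacement.

```sql
-- Alternative sketch: bucket by arithmetic instead of a recursive CTE.
-- Integer division by 10 truncates, so minute 3 -> 0, minute 58 -> 50.
-- extract() on a timestamptz uses the session time zone, so the
-- "set timezone" caveat from the query above still applies.
select make_time(extract(hour from departuredate)::int,
                 (extract(minute from departuredate)::int / 10) * 10,
                 0)                    as start_10min,
       avg(traveltime)                 as average_travel_time
from belt
where departuredate::date between date '2019-11-23' and date '2020-01-18'
group by 1
order by 1;
```

Two differences are worth keeping in mind: this version returns a plain time rather than a timetz, and it returns only the slot start, not the end. Like the inner join above, it produces no row for a slot that has no data.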
