Consider the following table:
create table measurement (
    datetime    timestamp,
    temperature numeric(5,2)
);
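I would like to write an SQL query in PostgreSQL that extracts the rows where the temperature stayed above 50 °C for at least 30 minutes, ideally also showing from when until when the temperature was actually above 50 °C. Sample data looks like this: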
datetime temperature
------------------- -----------
2017-03-15 19:00:10 49.56
2017-03-15 19:15:10 52.81
2017-03-15 19:30:10 49.00
2017-03-15 19:45:10 52.88
2017-03-15 20:00:10 49.56
2017-03-15 20:15:10 49.13
2017-03-15 20:30:10 51.31 <--
2017-03-15 20:45:10 52.06 <--
2017-03-15 21:00:10 50.50 <--
2017-03-15 21:15:10 50.50 <--
2017-03-15 21:30:10 49.38
2017-03-15 21:45:10 47.44
2017-03-15 22:00:10 46.19
2017-03-15 22:15:10 45.44
2017-03-15 22:30:10 50.25
2017-03-15 22:45:10 48.56
2017-03-15 23:00:10 51.25 <--
2017-03-15 23:15:10 50.44 <--
2017-03-15 23:30:10 50.63 <--
2017-03-15 23:45:10 46.75
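So, first identify the groups of consecutive rows with a temperature over 50. This is a gaps-and-islands problem. You can then aggregate the islands to get the information you want: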
select min(datetime), max(datetime), count(*) as numrecs, avg(temperature)
from (select t.*,
             -- position among all rows
             row_number() over (order by datetime) as seqnum,
             -- position among the rows on the same side of the threshold
             row_number() over (partition by (temperature >= 50)::int
                                order by datetime) as seqnum_t
      from measurement t
     ) t
where temperature >= 50
-- the difference is constant within each run of consecutive rows >= 50
group by (seqnum - seqnum_t)
having max(datetime) >= min(datetime) + interval '30' minute;
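It may not be obvious why the difference of the two row numbers identifies an island. The following is a small Python sketch, not part of the answer itself, that mimics the two row_number() calls on the sample rows from the question and groups by their difference. The function name `find_islands` and its parameters are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (timestamp, temperature).
ROWS = [
    ("2017-03-15 19:00:10", 49.56), ("2017-03-15 19:15:10", 52.81),
    ("2017-03-15 19:30:10", 49.00), ("2017-03-15 19:45:10", 52.88),
    ("2017-03-15 20:00:10", 49.56), ("2017-03-15 20:15:10", 49.13),
    ("2017-03-15 20:30:10", 51.31), ("2017-03-15 20:45:10", 52.06),
    ("2017-03-15 21:00:10", 50.50), ("2017-03-15 21:15:10", 50.50),
    ("2017-03-15 21:30:10", 49.38), ("2017-03-15 21:45:10", 47.44),
    ("2017-03-15 22:00:10", 46.19), ("2017-03-15 22:15:10", 45.44),
    ("2017-03-15 22:30:10", 50.25), ("2017-03-15 22:45:10", 48.56),
    ("2017-03-15 23:00:10", 51.25), ("2017-03-15 23:15:10", 50.44),
    ("2017-03-15 23:30:10", 50.63), ("2017-03-15 23:45:10", 46.75),
]


def find_islands(rows, threshold=50.0, min_duration=timedelta(minutes=30)):
    """Hypothetical helper: group rows by seqnum - seqnum_t, as the query does."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), temp) for ts, temp in rows
    )
    per_partition = defaultdict(int)  # row_number() per (temperature >= threshold)
    islands = defaultdict(list)       # key: seqnum - seqnum_t
    for seqnum, (ts, temp) in enumerate(parsed, start=1):
        hot = temp >= threshold
        per_partition[hot] += 1
        seqnum_t = per_partition[hot]
        if hot:  # corresponds to: where temperature >= 50
            islands[seqnum - seqnum_t].append(ts)
    result = []
    for times in islands.values():
        start, end = times[0], times[-1]
        if end - start >= min_duration:  # corresponds to the having clause
            result.append((start, end, len(times)))
    return result


for start, end, numrecs in find_islands(ROWS):
    print(start, end, numrecs)
```

While the temperature stays at or above 50, both counters advance by one per row, so their difference does not change. Every row below 50 advances only the overall counter, which shifts the difference for the next island.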
Gordon's solution can be simplified to use a single OLAP function:
select min(datetime), max(datetime), count(*) as numrecs, avg(temperature)
from
 (
   select datetime, temperature,
          -- previous time when temperature was < 50
          -- same time for all rows with a temp >= 50
          max(case when temperature < 50 then datetime end)
              over (order by datetime
                    rows unbounded preceding) as prevlow
   from measurement
 ) as dt
where temperature >= 50
group by prevlow
having max(datetime) >= min(datetime) + interval '30' minute;
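To see what prevlow holds for each row, here is a rough Python sketch of the same idea, run on an excerpt of the question's sample rows. It is only an illustration of the running maximum, not a replacement for the query, and the function name `islands_by_prevlow` is an assumption.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Excerpt of the question's sample rows (20:15:10 through 21:30:10).
ROWS = [
    ("2017-03-15 20:15:10", 49.13),
    ("2017-03-15 20:30:10", 51.31),
    ("2017-03-15 20:45:10", 52.06),
    ("2017-03-15 21:00:10", 50.50),
    ("2017-03-15 21:15:10", 50.50),
    ("2017-03-15 21:30:10", 49.38),
]


def islands_by_prevlow(rows, threshold=50.0, min_duration=timedelta(minutes=30)):
    """Hypothetical helper: tag each row >= threshold with the last time below it."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), temp) for ts, temp in rows
    )
    prevlow = None  # stays None (like SQL NULL) until the first low reading
    tagged = []
    for ts, temp in parsed:
        if temp < threshold:
            prevlow = ts  # running max of the "low" timestamps
        else:
            tagged.append((prevlow, ts))  # corresponds to: where temperature >= 50
    result = []
    # Rows are in time order and prevlow never decreases, so grouping
    # consecutive equal keys matches "group by prevlow".
    for key, group in groupby(tagged, key=lambda pair: pair[0]):
        times = [ts for _, ts in group]
        if times[-1] - times[0] >= min_duration:  # the having clause
            result.append((key, times[0], times[-1], len(times)))
    return result


for prevlow, start, end, numrecs in islands_by_prevlow(ROWS):
    print(prevlow, start, end, numrecs)
```

All four rows from 20:30:10 to 21:15:10 share the prevlow value 20:15:10, which is why grouping by that single column is enough.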