SQLfiddle : http://sqlfiddle.com/#!17/0ba48b
假设我有一张这样的桌子
CREATE TABLE trips (origin int, destination int, t TIMESTAMP);
INSERT INTO trips VALUES
(1, 1, '2016-06-22 08:02:00'),
(1, 3, '2016-06-22 08:10:00'),
(1, 3, '2016-06-22 08:12:00'),
(2, 1, '2016-06-22 08:49:00');
起点/目的地集{1,2,3}
。我想生成一个这样的表:
origin | destination | t |count
--------------------------------------------
1 | 1 | 08:00:00 | 1
1 | 1 | 08:15:00 | 0
1 | 1 | 08:30:00 | 0
1 | 1 | 08:45:00 | 0
1 | 2 | 08:00:00 | 0
...| | |
1 | 3 | 08:00:00 | 2
1 | 3 | 08:15:00 | 0
基本上,对于每个起点-目的地对,我想生成整个时间序列(间隔 15 分钟(,聚合行程并在需要时插入 0/null。
到目前为止我的尝试:
with
-- generate bins
bins as (
select bin
from
(select generate_series(
date_trunc('hour',
(SELECT min(t)
FROM trips
LIMIT 1)) , -- start at one day ago, rounded to the hour
date_trunc('hour', (SELECT max(t)
FROM trips
LIMIT 1))+ '1 day'::interval, -- stop at now, rounded to the hour
'15 minutes'::interval -- one hour intervals
) as bin ) intervals
where extract('ISODOW' FROM bin) < 6 -- exclude weekends
),
-- bin the od table https://gis.stackexchange.com/a/127874/13965
od as (
SELECT COUNT(*) cnt, origin, destination,
to_timestamp(floor((extract('epoch' from t) / 900 )) * 900) -- this bins it to 15 minutes, but has many missing values
AT TIME ZONE 'UTC' as interval_alias_exit
FROM
trips
GROUP BY interval_alias_exit, origin, destination
)
-- merge the two
select
* from
bins
left join od
on bins.bin = od.interval_alias_exit
where origin = '1'
但这不会为每个 OD 对创建时间序列。例如,当原点为1
时,它只会返回两行。请看小提琴的例子。
您需要将所有点与所有点交叉连接以获得所有起点目的地对,然后将其与时间序列交叉连接。然后,您可以在相等的起点和目的地上加入行程,并且时间戳大于或等于系列的时间戳,但小于系列的时间戳加 15 分钟。然后按起点、目的地和序列的时间戳进行聚合,并计算行程的时间戳。
SELECT o.origin,
d.destination,
gs.t,
count(t.t) count
FROM (SELECT origin
FROM trips
UNION
SELECT destination
FROM trips) o
CROSS JOIN (SELECT destination
FROM trips
UNION
SELECT origin
FROM trips) d
CROSS JOIN generate_series((SELECT date_trunc('hour', min(t))
+ concat(floor(extract('minutes' FROM min(t)) / 15) * 15,
' minutes')::interval
FROM trips),
(SELECT date_trunc('hour', max(t))
+ concat(floor(extract('minutes' FROM max(t)) / 15) * 15,
' minutes')::interval
FROM trips),
'15 minutes'::interval) gs (t)
LEFT JOIN trips t
ON t.origin = o.origin
AND t.destination = d.destination
AND t.t >= gs.t
AND t.t < gs.t + '15 minutes'::interval
GROUP BY o.origin,
d.destination,
gs.t
ORDER BY o.origin,
d.destination,
gs.t;
(对于示例所需结果中的时间序列边界,您似乎有另一种逻辑,而不是您发布的查询。我试图匹配前者。但我想你会想出如何改变它以满足你的实际需求。
SQL 小提琴