I am running the following SQL query on PrestoDB:
SELECT
    date_trunc('month', TimeFrom) AS Month,
    date_trunc('hour', TimeFrom) AS HourFrom,
    date_trunc('hour', TimeTo) AS HourTo,
    CASE
        WHEN resource = 'c81d9eb67e93' THEN 'S'
        WHEN resource = 'cce1eee4c73f' THEN 'L'
        WHEN resource = 'fbc1d63e0f15' THEN 'G'
        WHEN resource = '5ef75e433db2' THEN 'Q'
        WHEN resource = '4b6ba65cbe14' THEN 'T'
        WHEN resource = 'df0a8b60467d' THEN 'R'
        WHEN resource = '013472b63651' THEN 'C'
        WHEN resource = '7464b77d93be' THEN 'F'
        WHEN resource = '6f1186bdbc12' THEN 'M'
        ELSE 'other resource'
    END AS Resource,
    COUNT(*) AS Utilization
FROM (SELECT
          resource,
          date_add('hour', i - 1, mytable.TimeFrom) AS TimeFrom,
          date_add('hour', i, mytable.TimeFrom) AS TimeTo
      FROM (SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
            UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8
            UNION ALL SELECT 9 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL SELECT 12
            UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL SELECT 15 UNION ALL SELECT 16
            UNION ALL SELECT 17 UNION ALL SELECT 18 UNION ALL SELECT 19 UNION ALL SELECT 20
            UNION ALL SELECT 21 UNION ALL SELECT 22 UNION ALL SELECT 23 UNION ALL SELECT 24
           ) AS numbers
      INNER JOIN (SELECT
                      resource,
                      date_trunc('hour', FROM_UNIXTIME(timefrom)) AS TimeFrom,
                      date_trunc('hour', (FROM_UNIXTIME(timeto) + interval '45' minute)) AS TimeTo
                  FROM reservation
                  WHERE type = 'create'
                  ORDER BY date_trunc('day', FROM_UNIXTIME(timefrom)), resource
                 ) AS mytable ON numbers.i <= date_diff('hour', TimeFrom, TimeTo)
      ORDER BY TimeFrom, resource)
GROUP BY
    date_trunc('month', TimeFrom),
    date_trunc('hour', TimeFrom),
    date_trunc('hour', TimeTo),
    resource
It returns a table like this:
Month | HourFrom | HourTo | Resource | Utilization
2015-08-01 00:00:00.000 | 2015-08-05 14:00:00.000 | 2015-08-05 15:00:00.000 | S | 4
2015-08-01 00:00:00.000 | 2015-08-05 17:00:00.000 | 2015-08-05 18:00:00.000 | Q | 1
2015-08-01 00:00:00.000 | 2015-08-05 17:00:00.000 | 2015-08-05 18:00:00.000 | Q | 1
2015-08-01 00:00:00.000 | 2015-08-05 18:00:00.000 | 2015-08-05 19:00:00.000 | S | 5
2015-08-01 00:00:00.000 | 2015-08-05 20:00:00.000 | 2015-08-05 21:00:00.000 | Q | 1
2015-08-01 00:00:00.000 | 2015-08-05 21:00:00.000 | 2015-08-05 22:00:00.000 | Q | 1
2015-08-01 00:00:00.000 | 2015-08-05 22:00:00.000 | 2015-08-05 23:00:00.000 | Q | 1
2015-08-01 00:00:00.000 | 2015-08-06 00:00:00.000 | 2015-08-06 01:00:00.000 | L | 2
2015-08-01 00:00:00.000 | 2015-08-06 00:00:00.000 | 2015-08-06 01:00:00.000 | Q | 2
2015-08-01 00:00:00.000 | 2015-08-06 02:00:00.000 | 2015-08-06 03:00:00.000 | S | 3
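As you can see, the HourFrom and HourTo columns are broken out by day. Is there a way to aggregate them so that I get the total for each hour across all days of a month?
Suppose I have data for January and February, we are only open from 1 AM to 4 AM, and there are only 2 resources. The final table I would like to get then looks like this (i.e., the "Jan 1AM to Jan 2AM" row gives me the utilization [total count] for that time slot across all days in January):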
Month | HourFrom | HourTo | Resource | Utilization
Jan | Jan 1AM | Jan 2AM | S | number
Jan | Jan 1AM | Jan 2AM | X | number
Jan | Jan 2AM | Jan 3AM | S | number
Jan | Jan 2AM | Jan 3AM | X | number
Jan | Jan 3AM | Jan 4AM | S | number
Jan | Jan 3AM | Jan 4AM | X | number
Feb | Feb 1AM | Feb 2AM | S | number
Feb | Feb 1AM | Feb 2AM | X | number
Feb | Feb 2AM | Feb 3AM | S | number
Feb | Feb 2AM | Feb 3AM | X | number
Feb | Feb 3AM | Feb 4AM | S | number
Feb | Feb 3AM | Feb 4AM | X | number
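How can I change my original query to do this?
I would start with a simpler query that identifies each hour by its number on the 24-hour clock and does not fill in missing hours. It counts each reservation once, in the hour in which it starts. That would be: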
-- timefrom is stored as a Unix epoch in the question's query,
-- so it is converted with from_unixtime() before any date function is applied.
SELECT date_trunc('month', from_unixtime(timefrom)) AS Month,
       hour(from_unixtime(timefrom)) AS HourFrom,
       (CASE WHEN resource = 'c81d9eb67e93' THEN 'S'
             WHEN resource = 'cce1eee4c73f' THEN 'L'
             WHEN resource = 'fbc1d63e0f15' THEN 'G'
             WHEN resource = '5ef75e433db2' THEN 'Q'
             WHEN resource = '4b6ba65cbe14' THEN 'T'
             WHEN resource = 'df0a8b60467d' THEN 'R'
             WHEN resource = '013472b63651' THEN 'C'
             WHEN resource = '7464b77d93be' THEN 'F'
             WHEN resource = '6f1186bdbc12' THEN 'M'
             ELSE 'other resource'
        END) AS Resource,
       COUNT(*) AS Utilization
FROM reservation r
WHERE type = 'create'
GROUP BY date_trunc('month', from_unixtime(timefrom)),
         hour(from_unixtime(timefrom)),
         resource;
Perhaps this version is sufficient for what you are trying to achieve.
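If a reservation should be counted in every hour it spans, as in the question's query, the same month/hour grouping can be applied after the expansion step. The following is only a sketch and has not been run against the real table: it assumes the `reservation` columns used in the question (`resource`, `timefrom`, `timeto`, `type`), replaces the hand-written 1..24 numbers table with Presto's `sequence()` and `UNNEST` (so it does not cap a reservation at 24 hours the way the original join does), and uses the hypothetical aliases `r`, `t`, `slots`, and `slot_start`.

```sql
SELECT
    date_trunc('month', slot_start)          AS Month,
    hour(slot_start)                         AS HourFrom,  -- hour of day, 0..23
    hour(date_add('hour', 1, slot_start))    AS HourTo,    -- wraps from 23 to 0
    resource,
    COUNT(*)                                 AS Utilization
FROM (
    -- One row per reservation per hour it occupies.
    SELECT
        r.resource,
        date_add('hour', t.i - 1, r.TimeFrom) AS slot_start
    FROM (
        -- Same hour rounding as in the question's inner query.
        SELECT
            resource,
            date_trunc('hour', from_unixtime(timefrom)) AS TimeFrom,
            date_trunc('hour', from_unixtime(timeto) + interval '45' minute) AS TimeTo
        FROM reservation
        WHERE type = 'create'
    ) AS r
    CROSS JOIN UNNEST(sequence(1, date_diff('hour', r.TimeFrom, r.TimeTo))) AS t(i)
    -- sequence(1, n) counts downward when n < 1, so drop those reservations.
    WHERE date_diff('hour', r.TimeFrom, r.TimeTo) >= 1
) AS slots
GROUP BY 1, 2, 3, 4
ORDER BY 1, 2, 4;
```

Because the grouping key is `hour(slot_start)` rather than `date_trunc('hour', ...)`, all days of a month collapse into one row per hour of day and resource. The `CASE` expression that maps resource ids to letters can be wrapped around `resource` in the outer `SELECT` if the short labels are wanted.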