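I have three columns: timein (timestamp), timeout (timestamp), and employee. I need to get the number of employees working within each time frame (30-minute intervals). For example: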
employee_id timein timeout
101 10:10 12:59
102 9:07 12:16
103 11:16 12:08
I need a query that gives me this result:
timeframe count(employee_id)
09:00 1
09:30 1
10:00 2
10:30 2
11:00 3
11:30 3
12:00 3
12:30 1
I hope I have made this clear. Thanks.
See this demo: http://sqlfiddle.com/#!17/2477f/1
SELECT x.timeframe, count(employee_id)
FROM (
select time '8:00' + x * interval '30 minute' as timeframe,
time '8:00' + (x+1) * interval '30 minute' as timeframe_end
from generate_series(0,10) x
) x
LEFT JOIN employee t
/* (StartA <= EndB) and (EndA >= StartB) */
ON x.timeframe <= t.timeout
AND x.timeframe_end >= t.timein
GROUP BY x.timeframe
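ORDER BY 1;
The same query with strict comparisons and a longer series, which produces the output shown below: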
SELECT x.timeframe, count(employee_id)
FROM (
select time '8:00' + x * interval '30 minute' as timeframe,
time '8:00' + (x+1) * interval '30 minute' as timeframe_end
from generate_series(0,12) x
) x
LEFT JOIN employee t
/* (StartA < EndB) and (EndA > StartB) */
ON x.timeframe < t.timeout
AND x.timeframe_end > t.timein
GROUP BY x.timeframe
ORDER BY 1
| timeframe | count |
|-----------|-------|
| 08:00:00 | 0 |
| 08:30:00 | 0 |
| 09:00:00 | 1 |
| 09:30:00 | 1 |
| 10:00:00 | 2 |
| 10:30:00 | 2 |
| 11:00:00 | 3 |
| 11:30:00 | 3 |
| 12:00:00 | 3 |
| 12:30:00 | 1 |
| 13:00:00 | 1 |
| 13:30:00 | 1 |
| 14:00:00 | 0 |
The join condition uses the formula from this answer to check whether two ranges overlap:
(StartA < EndB) and (EndA > StartB)
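As a rough illustration of the formula, the sketch below evaluates it for a single slot (12:30 to 13:00) against the three sample rows from the question. The temporary table name `employee_sample` is an assumption made only for this example; the slot plays the role of range A and the shift the role of range B.

```sql
-- Sample rows from the question, in a temporary table (hypothetical name).
CREATE TEMP TABLE employee_sample (employee_id int, timein time, timeout time);
INSERT INTO employee_sample VALUES
    (101, '10:10', '12:59'),
    (102, '09:07', '12:16'),
    (103, '11:16', '12:08');

-- Slot 12:30-13:00 is range A, each shift is range B:
-- (StartA < EndB) and (EndA > StartB)
SELECT employee_id,
       (time '12:30' < timeout AND time '13:00' > timein) AS overlaps_slot
FROM employee_sample
ORDER BY employee_id;
```

Only employee 101 satisfies the test, which agrees with the count of 1 for the 12:30 time frame in the expected result.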
The demo also shows how the query behaves for edge cases:
(113, '13:00', '13:01'),
(115, '13:30', '14:00')
The latter employee starts at 13:30 and finishes at 14:00, so they are included in the 13:30 time frame but not in the 14:00 time frame.
| 13:00:00 | 1 |
| 13:30:00 | 1 |
| 14:00:00 | 0 |
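To see why the strict comparisons matter at a boundary, here is a minimal check of the 13:30 to 14:00 shift against the 14:00 to 14:30 slot. The column aliases are made up for this sketch.

```sql
-- Shift 13:30-14:00 (employee 115) versus the slot 14:00-14:30.
SELECT
    -- strict: slot_start < timeout AND slot_end > timein
    (time '14:00' <  time '14:00' AND time '14:30' >  time '13:30') AS strict_overlap,
    -- inclusive: slot_start <= timeout AND slot_end >= timein
    (time '14:00' <= time '14:00' AND time '14:30' >= time '13:30') AS inclusive_overlap;
```

The strict form is false and the inclusive form is true, so a query written with `<=` and `>=` would also count this employee in the 14:00 time frame.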
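A problem can arise with employees who start and finish work several times within the same time frame (for example, workers who take frequent coffee breaks), e.g.: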
(113, '13:00', '13:01'),
(113, '13:12', '13:15'),
(113, '13:22', '13:26')
In that case you need to count distinct employees, using: count(DISTINCT employee_id)
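A small sketch of the difference, using the three coffee-break rows above and the 13:00 to 13:30 slot. The temporary table name `coffee_breaks` and the column aliases are assumptions for illustration only.

```sql
-- The three short shifts of employee 113, in a temporary table (hypothetical name).
CREATE TEMP TABLE coffee_breaks (employee_id int, timein time, timeout time);
INSERT INTO coffee_breaks VALUES
    (113, '13:00', '13:01'),
    (113, '13:12', '13:15'),
    (113, '13:22', '13:26');

-- All three rows overlap the 13:00-13:30 slot, but they belong to one employee.
SELECT count(employee_id)          AS row_count,
       count(DISTINCT employee_id) AS employee_count
FROM coffee_breaks
WHERE time '13:00' < timeout
  AND time '13:30' > timein;
```

Here `row_count` is 3 while `employee_count` is 1, so the plain count would overstate the number of people working in that time frame.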
Try something like this.
SELECT timeframe,
COUNT (employee_id)
FROM employee a
RIGHT JOIN
(SELECT *
FROM generate_series (TIMESTAMP '2017-09-01 09:00:00',
TIMESTAMP '2017-09-01 17:00:00',
INTERVAL '0.5 HOUR' ) AS timeframe) b
ON b.timeframe >= timein
AND b.timeframe <= timeout
GROUP BY timeframe
ORDER BY timeframe ;
SELECT out_time-in_time time_frame, count(*) FROM
TABLE_NAME GROUP BY out_time-in_time
I tested this against some local sample data.
employee_id | in_time | out_time
-------------+----------+----------
101 | 09:07:00 | 12:08:00
102 | 10:07:00 | 17:08:00
103 | 12:07:00 | 17:08:00
104 | 12:07:00 | 17:08:00
105 | 10:07:00 | 17:08:00
Output from the query:
time_frame | count
------------+-------
07:01:00 | 2
03:01:00 | 1
05:01:00 | 2
Once you have the difference, you can add further logic as needed.
SQL Fiddle
PostgreSQL 9.6 Schema Setup:
CREATE TABLE emp_time
("employee_id" int, "timein" time, "timeout" time)
;
INSERT INTO emp_time
("employee_id", "timein", "timeout")
VALUES
(101, '10:10', '12:59'),
(102, '9:07', '12:16'),
(103, '11:16', '12:08')
;
Query 1:
SELECT
slot_start
, slot_end
, count(employee_id)
FROM (
SELECT slot_start, slot_start + INTERVAL '30 MINUTE' slot_end
FROM generate_series (TIMESTAMP '2017-01-01 09:00:00', TIMESTAMP '2017-01-01 16:30:00', INTERVAL '30 MINUTE' ) AS slot_start
) t
LEFT JOIN emp_time et ON et.timein < t.slot_end::time and et.timeout > t.slot_start::time
GROUP BY
slot_start
, slot_end
ORDER BY
slot_start
, slot_end
;
Results:
| slot_start | slot_end | count |
|----------------------|----------------------|-------|
| 2017-01-01T09:00:00Z | 2017-01-01T09:30:00Z | 1 |
| 2017-01-01T09:30:00Z | 2017-01-01T10:00:00Z | 1 |
| 2017-01-01T10:00:00Z | 2017-01-01T10:30:00Z | 2 |
| 2017-01-01T10:30:00Z | 2017-01-01T11:00:00Z | 2 |
| 2017-01-01T11:00:00Z | 2017-01-01T11:30:00Z | 3 |
| 2017-01-01T11:30:00Z | 2017-01-01T12:00:00Z | 3 |
| 2017-01-01T12:00:00Z | 2017-01-01T12:30:00Z | 3 |
| 2017-01-01T12:30:00Z | 2017-01-01T13:00:00Z | 1 |
| 2017-01-01T13:00:00Z | 2017-01-01T13:30:00Z | 0 |
| 2017-01-01T13:30:00Z | 2017-01-01T14:00:00Z | 0 |
| 2017-01-01T14:00:00Z | 2017-01-01T14:30:00Z | 0 |
| 2017-01-01T14:30:00Z | 2017-01-01T15:00:00Z | 0 |
| 2017-01-01T15:00:00Z | 2017-01-01T15:30:00Z | 0 |
| 2017-01-01T15:30:00Z | 2017-01-01T16:00:00Z | 0 |
| 2017-01-01T16:00:00Z | 2017-01-01T16:30:00Z | 0 |
| 2017-01-01T16:30:00Z | 2017-01-01T17:00:00Z | 0 |