My data contains the shift start time and end time for each employee.
user start_time end_time
5238797 08:00 10:00
3919833 08:00 11:30
1642034 08:00 11:30
3818609 08:00 11:30
4903371 09:00 15:00
4786985 09:00 11:00
4513139 09:00 12:00
4452816 09:00 12:00
...
I am trying to build a table that counts how many employees are working at specific time intervals on a given day. For example:
Time Count
08:00 4
08:30 4
09:00 8
09:30 8
10:00 7
...
I tried the following:
SUM(IF("08:00" >= start_time AND "08:00" < end_time, 1, 0)) AS H8
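However, any change to the desired interval (for example, from every 30 minutes to every 15 minutes) requires a lot of manual copy/paste. In addition, the results end up in columns rather than rows.
Does anyone know how I can do this? I am using Standard SQL in Google BigQuery.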
You could use a query like this:
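with data as (
  -- sample rows standing in for the real shift table
  select 5238797 as user, TIME "08:00:00" as start_time, TIME "10:00:00" as end_time
  union all
  select 3919833 as user, TIME "08:00:00" as start_time, TIME "11:30:00" as end_time
  union all
  select 4903371 as user, TIME "09:00:00" as start_time, TIME "15:00:00" as end_time
), slots as (
  -- 17 half-hour slots from 08:00 to 16:00, each paired with every shift
  SELECT
    num,
    time_add(time(8,0,0), INTERVAL 30*num minute) as slot,
    data.*
  FROM UNNEST(GENERATE_ARRAY(0,16)) AS num
  cross join data
), t as (
  -- keep the user id only when the slot falls inside the shift;
  -- BETWEEN is inclusive, so a shift still counts at its end_time
  select slots.*, if(slot between start_time and end_time, user, null) as works from slots
), t_final as (
  -- count(distinct ...) ignores NULLs, so empty slots report 0
  select slot, count(distinct works) as cc from t
  group by 1
)
-- sort in the outermost query; an ORDER BY inside a CTE does not guarantee output order
select * from t_final
order by 1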
This returns:
+----------+----+
| slot     | cc |
+----------+----+
| 08:00:00 |  2 |
| 08:30:00 |  2 |
| 09:00:00 |  3 |
| 09:30:00 |  3 |
| 10:00:00 |  3 |
| 10:30:00 |  2 |
| 11:00:00 |  2 |
| 11:30:00 |  2 |
| 12:00:00 |  1 |
| 12:30:00 |  1 |
| 13:00:00 |  1 |
| 13:30:00 |  1 |
| 14:00:00 |  1 |
| 14:30:00 |  1 |
| 15:00:00 |  1 |
| 15:30:00 |  0 |
| 16:00:00 |  0 |
+----------+----+
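Note that BETWEEN includes both endpoints, so in the result above the employee whose shift ends at 10:00 is still counted in the 10:00 slot. The question's own attempt treats end_time as exclusive (>= start_time AND < end_time). If that is what you want, and you would like the step size to live in one place, something along these lines might work. It is only a sketch: the params CTE and its column names (step_minutes, day_start, day_end) are made up for illustration, and shifts is a stand-in for your real table.

```sql
-- Sketch only: `shifts` stands in for the real table; `params` and its
-- columns (step_minutes, day_start, day_end) are hypothetical names.
WITH shifts AS (
  SELECT 5238797 AS user, TIME "08:00:00" AS start_time, TIME "10:00:00" AS end_time
  UNION ALL
  SELECT 3919833, TIME "08:00:00", TIME "11:30:00"
  UNION ALL
  SELECT 4903371, TIME "09:00:00", TIME "15:00:00"
), params AS (
  -- change step_minutes here to switch between 30- and 15-minute slots
  SELECT 15 AS step_minutes, TIME "08:00:00" AS day_start, TIME "16:00:00" AS day_end
), slots AS (
  -- one row per slot from day_start to day_end, step_minutes apart
  SELECT TIME_ADD(day_start, INTERVAL n * step_minutes MINUTE) AS slot
  FROM params,
    UNNEST(GENERATE_ARRAY(0, DIV(TIME_DIFF(day_end, day_start, MINUTE), step_minutes))) AS n
)
SELECT
  slot,
  -- half-open interval: counted at start_time, no longer counted at end_time
  COUNT(DISTINCT IF(slot >= s.start_time AND slot < s.end_time, s.user, NULL)) AS cc
FROM slots
CROSS JOIN shifts AS s
GROUP BY slot
ORDER BY slot
```

With the three sample rows, the 10:00 slot should then come out as 2 instead of 3. The cross join produces slots × shifts intermediate rows, so for a large table you may want to filter to a single day before joining.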