雪花服务器,按小时分配持续时间



i在表中具有以下结构中的数据,其中所有id s都是唯一的,并且有许多不同的启动和结合组合,在启动和结束之间的差异范围从小于小于一个最多只有几百分钟。我只需要将持续时间分开以使开始/结束差异大于60分钟。

| ID     | DURATION_START          | DURATION_END            |
|--------|-------------------------|-------------------------|
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 |

,我想创建一个表格,将持续时间按小时划分为原始条目的ID,如下:

| ID     | DURATION_START          | DURATION_END            |
|--------|-------------------------|-------------------------|
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 01:07:59.000 |
| 0abc23 | 2019-06-29 01:08:00.000 | 2019-06-29 02:07:59.000 |
| 0abc23 | 2019-06-29 02:08:00.000 | 2019-06-29 03:07:59.000 |
| 0abc23 | 2019-06-29 03:08:00.000 | 2019-06-29 04:07:59.000 |
| 0abc23 | 2019-06-29 04:08:00.000 | 2019-06-29 05:18:00.000 |

我尝试了其他数据库引擎的类似问题中的以下代码:

with table1 as (
      select id
          , duration_start as fromdate
          , duration_end as todate
      from test_data
),
tab1 as (
      select id
        , fromdate
        , dateadd(second, 60*60 - 1, fromdate) as todate1 // add an hour minus a second
        , todate
      from table1
      union all
      select id
        , dateadd(minutes, 60, todate1) as fromdate // add an hour to todate1
        , dateadd(second, 2*60*60-1, todate1) as todate1 // add 1 hours to todate1
        , todate
      from tab1 where date_trunc(hour, todate1) < date_trunc(hour, todate)
),
tab2 as (
      select id
        , fromdate
        , case when todate1 > todate then todate else todate1 end as todate
      from tab1

但这给了我

| ID     | DURATION_START          | DURATION_END            |
|--------|-------------------------|-------------------------|
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 01:07:59.000 |
| 0abc23 | 2019-06-29 02:07:59.000 | 2019-06-29 03:07:58.000 |
| 0abc23 | 2019-06-29 04:07:58.000 | 2019-06-29 05:07:57.000 |
| 0abc23 | 2019-06-29 06:07:57.000 | 2019-06-29 07:07:56.000 |
| 0abc23 | 2019-06-29 08:07:56.000 | 2019-06-29 09:07:55.000 |

我尝试调整添加的时间,但在开始和结束之间无法获得1小时的偏移。我还尝试使用connect bylevel(建议使用类似的问题,但在Oracle上建议(,但无法将其编译。我只会获得LEVEL的无效标识符错误。

select dateadd(hour,24,duration_start)
      , greatest(duration_start, date_trunc(hour,dateadd(hour,((level-1)::int/24)::int,duration_start)))
      , least(duration_start, date_trunc(hour,dateadd(hour,((level)::int/24)::int,duration_start)))
from test_data
connect by prior LEVEL = floor(datediff(hour, duration_start, duration_end)::int*24)+1;

即使只是尝试使用connect by也会给我一个错误:

select *
from test_data
connect by prior LEVEL = floor(datediff(hour, duration_start, duration_end)::int*24)+1;

错误: SQL compilation error: error line 0 at position -1 invalid identifier 'HOUR'

如果有人可以告诉我如何调整我的雪花方法,这将不胜感激,谢谢!

对于此类项目,它有助于拥有单独的日期&amp;间隔表。

这是实现您需要的脚本。

创建一个带有您需要的日期范围的日历表。

我从'1970-01-01'开始的unix时期,因为我在这种情况下保留了一个日历表。

create or replace table calendar(calendar_date DATE)
insert into calendar(calendar_date)
select
dateadd(d,rid2,to_timestamp_ntz('1970-01-01')) as calendar_date
from
(
select 0 as rid2 union
select row_number() over (order by null) as rid2
from table (generator(rowcount => 100000))
) t
where dateadd(d,rid2,to_timestamp_ntz('1970-01-01')) < '2030-01-01'
order by 1 asc;

接下来创建一个时间/间隔表。间隔表。在您的示例中,您的间隔持续时间为1小时。因此创建了以下内容。

create or replace table interval(id integer, interval_start time);
insert into interval (id,interval_start)
select
id,
to_time(dateadd(hour,id,to_timestamp_ntz('1970-01-01')))
from
(
select 0 as id union
select row_number() over (order by null) as id
from table (generator(rowcount => 23))
) t;

接下来,我创建了一个带有您的示例数据的表,以及其他几个值,以便可以在不同方案中验证计算。

create or replace table example1(id varchar(10), DURATION_START datetime, DURATION_END datetime);
-- drop table example1
truncate table example1;
--
insert into  example1 values('0abc23','2019-06-29 00:08:00.000','2019-06-29 09:18:00.000');
insert into  example1 values('0abc24','2019-06-28 11:07:45.000','2019-06-28  12:08:45.000');
insert into  example1 values('0abc25','2019-06-28 01:00:00.000','2019-06-29 02:15:00.000');
insert into  example1 values('0abc26','2019-06-28 00:08:00.000','2019-06-29 15:18:00.000');

鉴于所有内容都是设置,以下查询将为您提供所需的结果。

select 
      f.id
     ,f.DURATION_START
     ,f.DURATION_END
     ,f.start_time_HOUR_START
     ,f.end_time_HOUR_START
     ,q.CALENDAR_DATE
     ,q.HOUR_START
     ,q.HOUR_END
     ,case
    -- when starts during interval and ends after interval
           when f.DURATION_START >= q.HOUR_START and f.DURATION_END >= dateadd(hour, 1, q.HOUR_START)
               then datediff(s, f.DURATION_START, dateadd(hour, 1, q.HOUR_START))
    -- when starts during interval and ends during interval
           when f.DURATION_START >= q.HOUR_START and f.DURATION_END <= dateadd(hour, 1, q.HOUR_START)
               then datediff(s, f.DURATION_START, f.DURATION_END)
    -- when starts before interval and ends during interval
           when f.DURATION_START <= q.HOUR_START and f.DURATION_END <= dateadd(hour, 1, q.HOUR_START)
               then datediff(s, q.HOUR_START, f.DURATION_END)
    -- entire interval , starts before, and ends after
           when
               f.DURATION_START <= q.HOUR_START and f.DURATION_END >= dateadd(hour, 1, q.HOUR_START)
               then datediff(s, q.HOUR_START, dateadd(hour, 1, q.HOUR_START))
           else 0 end as seconds_elapsed
from (
         select *
              , to_timestamp(
                 dateadd(s, datediff(s, '1970-01-01', DURATION_START) - (datediff(s, '1970-01-01', DURATION_START) % 3600),
                         '1970-01-01')) as start_time_HOUR_START
              , to_timestamp(
                 dateadd(s, datediff(s, '1970-01-01', DURATION_END) - (datediff(s, '1970-01-01', DURATION_END) % 3600),
                         '1970-01-01')) as end_time_HOUR_START
         from example1
     ) f
         inner join
     (
         select
                distinct
                q1.calendar_date
              -- , t2.rid2
              , dateadd(hour, t2.id, to_timestamp(q1.calendar_date)) as HOUR_START
              , dateadd(hour, t2.id + 1, to_timestamp(q1.calendar_date)) as HOUR_END
         from (
                  select calendar_date
                  from calendar
                  where calendar_date between (select to_date(min(DURATION_START)) from example1) and (select to_date(max(DURATION_END)) from example1)
              ) q1
                  cross join
              interval as t2
        -- order by HOUR_START
     ) q on q.HOUR_START between f.start_time_HOUR_START and f.end_time_HOUR_START
ORDER BY f.id
       , f.DURATION_START
       , f.DURATION_END
       , q.CALENDAR_DATE
       , q.HOUR_START
;

以下样本输出。运行脚本以获取最终结果:

+--------+-------------------------+-------------------------+-------------------------+-------------------------+---------------+-------------------------+-------------------------+-----------------+
| ID     | DURATION_START          | DURATION_END            | START_TIME_HOUR_START   | END_TIME_HOUR_START     | CALENDAR_DATE | HOUR_START              | HOUR_END                | SECONDS_ELAPSED |
|--------+-------------------------+-------------------------+-------------------------+-------------------------+---------------+-------------------------+-------------------------+-----------------|
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 00:00:00.000 | 2019-06-29 01:00:00.000 |            3120 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 01:00:00.000 | 2019-06-29 02:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 02:00:00.000 | 2019-06-29 03:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 03:00:00.000 | 2019-06-29 04:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 04:00:00.000 | 2019-06-29 05:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 05:00:00.000 | 2019-06-29 06:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 06:00:00.000 | 2019-06-29 07:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 07:00:00.000 | 2019-06-29 08:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 08:00:00.000 | 2019-06-29 09:00:00.000 |            3600 |
| 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:00:00.000 | 2019-06-29 09:00:00.000 | 2019-06-29    | 2019-06-29 09:00:00.000 | 2019-06-29 10:00:00.000 |            1080 |
| 0abc24 | 2019-06-28 11:07:45.000 | 2019-06-28 12:08:45.000 | 2019-06-28 11:00:00.000 | 2019-06-28 12:00:00.000 | 2019-06-28    | 2019-06-28 11:00:00.000 | 2019-06-28 12:00:00.000 |            3135 |
| 0abc24 | 2019-06-28 11:07:45.000 | 2019-06-28 12:08:45.000 | 2019-06-28 11:00:00.000 | 2019-06-28 12:00:00.000 | 2019-06-28    | 2019-06-28 12:00:00.000 | 2019-06-28 13:00:00.000 |             525 |

有关代码的链接,请单击

可以用各种技巧来完成此操作,但迄今为止,IMO是为此创建一个简短的JavaScript UDTF。

完整的示例,还有一些其他行以下面的测试。请注意,可能需要另外涵盖某些角案件,例如如果您的结束时间是在开始时间之前,或者其中一个是无效的。但这应该有所帮助。

create or replace table x(
  id varchar, 
  duration_start timestamp_ntz,
  duration_end timestamp_ntz
) as 
select * from values
  ('0abc23', '2019-06-29 00:08:00.000', '2019-06-29 09:18:00.000'),
  ('id_2__', '2002-02-02 02:00:00.000', '2002-02-02 02:00:00.000'),
  ('id_3__', '2003-03-03 03:00:00.000', '2003-03-03 04:00:00.000'),
  ('id_4__', '2004-04-04 04:00:00.000', '2004-04-04 04:59:59.000');
create or replace function magic(
  id varchar, 
  duration_start timestamp_ntz,
  duration_end timestamp_ntz
) 
returns table (
  hour_start timestamp_ntz, 
  hour_end timestamp_ntz
) language javascript as 
$$
{
  processRow: function(row, rowWriter, context) {
    let msSecond = 1000;
    let msHour = 60 * 60 * msSecond;
    let msStart = row.DURATION_START.getTime();
    let msEnd = row.DURATION_END.getTime();
    while (msStart <= msEnd) {
      let curEnd = Math.min(msEnd, msStart + msHour - msSecond);
      rowWriter.writeRow({
        HOUR_START: new Date(msStart),
        HOUR_END: new Date(curEnd)
      });
      msStart = curEnd + msSecond;
    }
  }
}
$$;
select * from x, table(magic(id, duration_start, duration_end)) ;
--------+-------------------------+-------------------------+-------------------------+-------------------------+
   ID   |     DURATION_START      |      DURATION_END       |       HOUR_START        |        HOUR_END         |
--------+-------------------------+-------------------------+-------------------------+-------------------------+
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 00:08:00.000 | 2019-06-29 01:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 01:08:00.000 | 2019-06-29 02:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 02:08:00.000 | 2019-06-29 03:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 03:08:00.000 | 2019-06-29 04:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 04:08:00.000 | 2019-06-29 05:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 05:08:00.000 | 2019-06-29 06:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 06:08:00.000 | 2019-06-29 07:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 07:08:00.000 | 2019-06-29 08:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 08:08:00.000 | 2019-06-29 09:07:59.000 |
 0abc23 | 2019-06-29 00:08:00.000 | 2019-06-29 09:18:00.000 | 2019-06-29 09:08:00.000 | 2019-06-29 09:18:00.000 |
 id_2__ | 2002-02-02 02:00:00.000 | 2002-02-02 02:00:00.000 | 2002-02-02 02:00:00.000 | 2002-02-02 02:00:00.000 |
 id_3__ | 2003-03-03 03:00:00.000 | 2003-03-03 04:00:00.000 | 2003-03-03 03:00:00.000 | 2003-03-03 03:59:59.000 |
 id_3__ | 2003-03-03 03:00:00.000 | 2003-03-03 04:00:00.000 | 2003-03-03 04:00:00.000 | 2003-03-03 04:00:00.000 |
 id_4__ | 2004-04-04 04:00:00.000 | 2004-04-04 04:59:59.000 | 2004-04-04 04:00:00.000 | 2004-04-04 04:59:59.000 |
--------+-------------------------+-------------------------+-------------------------+-------------------------+

请注意,结果包括原始开始/结束,但不必。

相关内容

  • 没有找到相关文章

最新更新