使用开始和停止时间对时间序列事务进行分组和汇总



我有一系列交易。我需要总结处理这些事务所花费的总时间,以及当您从一个组移动到另一个组时中断的组内事务计数。这意味着,如果该组有多个时隙的一组事务,则该组可以重复。我正在使用oracle数据库和SQL来做这个总结。我正在寻找一些帮助编写 SQL 来总结这些数据。我已经使用各种分析和分组函数尝试了许多不同的查询。

源数据:

User ID | Area | Start Time | End Time | Transaction ID
jsmith  24  07-JUL-14 09.06.21.000000000 AM 07-JUL-14 09.14.51.000000000 AM 63665020
jsmith  24  07-JUL-14 11.06.01.000000000 AM 07-JUL-14 11.07.10.000000000 AM 63679145
jsmith  24  07-JUL-14 11.16.36.000000000 AM 07-JUL-14 11.16.51.000000000 AM 63680441
jsmith  21  07-JUL-14 11.33.05.000000000 AM 07-JUL-14 11.33.18.000000000 AM 63683133
jsmith  21  07-JUL-14 11.48.55.000000000 AM 07-JUL-14 11.51.03.000000000 AM 63685906
jsmith  21  07-JUL-14 11.55.48.000000000 AM 07-JUL-14 11.56.01.000000000 AM 63686651
jsmith  24  07-JUL-14 12.32.51.000000000 PM 07-JUL-14 12.32.55.000000000 PM 63690943
jsmith  24  07-JUL-14 12.32.55.000000000 PM 07-JUL-14 12.33.02.000000000 PM 63690956
jsmith  21  07-JUL-14 12.42.05.000000000 PM 07-JUL-14 12.42.11.000000000 PM 63692212
jsmith  21  07-JUL-14 12.44.38.000000000 PM 07-JUL-14 12.44.43.000000000 PM 63692600
jsmith  25  07-JUL-14 01.01.18.000000000 PM 07-JUL-14 01.01.52.000000000 PM 63694808
jsmith  25  07-JUL-14 01.03.43.000000000 PM 07-JUL-14 01.04.44.000000000 PM 63695353
jsmith  25  07-JUL-14 01.12.30.000000000 PM 07-JUL-14 01.14.06.000000000 PM 63697002
jsmith  24  07-JUL-14 01.14.11.000000000 PM 07-JUL-14 01.14.23.000000000 PM 63697045
jsmith  24  07-JUL-14 01.21.41.000000000 PM 07-JUL-14 01.23.46.000000000 PM 63699080

所需摘要数据:(模拟(

User ID | Area | Start Time | End Time | Total Minutes | Total Transactions
jsmith    24          9:06am   11:16am     110              3 
jsmith    21         11:33am   11:56am      23              3 
jsmith    24    .....
jsmith    21    .....
jsmith    25    .....
jsmith    24    .....

要解决这些问题,您需要为每行连续生成一些组号。在这里,我首先使用LAG在每次开始一个新组时生成一个刻度线。使用 SUM 的外部查询将计算从第一个到当前刻度线的刻度线数,以生成组号:

SELECT "Area", 
       MIN("Start Time") as "Start Time", 
       MAX("End Time") as "End Time",  
       SUM("End Time" - "Start Time")*24*60 as "Total Minutes", 
       COUNT("Transaction ID") as "#Transaction ID"
FROM (
  SELECT SUM(clk)
         OVER (ORDER BY "Start Time"
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) grp,
--       ^^^^^^^^^^^^^
--        Generate a group number by summing the tick marks
         V1.*
  FROM (
    SELECT CASE
             WHEN LAG("Area", 1, NULL) OVER (ORDER BY "Start Time") = "Area"
             THEN 0
             ELSE 1
           END clk,
--         ^^^^^^^^
--         Set a tick mark ("clock") to 1 each time we change group
         T.*
    FROM T
    ) V1
  ) V2
GROUP BY GRP, "Area"
ORDER BY "Start Time"

见 http://sqlfiddle.com/#!4/93f05/2


有点难以掌握,但这也很有效:

SELECT "Area", 
       MIN("Start Time") as "Start Time", 
       MAX("End Time") as "End Time",  
       SUM("End Time" - "Start Time")*60 as "Total Minutes", 
       COUNT("Transaction ID") as "#Transaction ID"
FROM (
  SELECT ROWNUM-ROW_NUMBER() 
                   OVER (PARTITION BY "Area" ORDER BY "Start Time") grp,
       T.*
  FROM T
  ORDER BY "Start Time"
) V
GROUP BY GRP, "Area"
ORDER BY "Start Time"

请参阅 http://sqlfiddle.com/#!4/07d43/6

在这种情况下,请注意您可以有重复的grp - 但仅适用于不同的组。除非行是连续的。有关此内容的讨论,请参阅下面的评论。

相关内容

  • 没有找到相关文章

最新更新