当偏移量可变时,如何获取"next"事件?



我在Oracle数据库中有一个事务表。我正试图为涉及多个交易类型的交付系统拉一个报告。"请求"类型实际上可以是四个子类型之一(本例中为"A"、"B"、"C"one_answers"D"),"交付"类型可以是四个子类型之一("PULL"、"PICKUP"、"MAIL")。从"请求"到"交付"可以有1到5个事务,并且许多"交付"类型也是中间事务。例子:

Item | Transaction | Timestamp
001  | REQ-A       | 2014-07-31T09:51:32Z
002  | REQ-B       | 2014-07-31T09:55:53Z
003  | REQ-C       | 2014-07-31T10:01:15Z
004  | REQ-D       | 2014-07-31T10:02:29Z
005  | REQ-A       | 2014-07-31T10:05:47Z
002  | PULL        | 2014-07-31T10:20:04Z
002  | MAIL        | 2014-07-31T10:20:06Z
001  | PULL        | 2014-07-31T10:22:21Z
001  | TRANSFER    | 2014-07-31T10:22:23Z
003  | PULL        | 2014-07-31T10:24:10Z
003  | TRANSFER    | 2014-07-31T10:24:12Z
004  | PULL        | 2014-07-31T10:26:28Z
005  | PULL        | 2014-07-31T10:28:42Z
005  | TRANSFER    | 2014-07-31T10:28:44Z
001  | ARRIVE      | 2014-07-31T11:45:01Z
001  | PICKUP      | 2014-07-31T11:45:02Z
003  | ARRIVE      | 2014-07-31T11:47:44Z
003  | PICKUP      | 2014-07-31T11:47:45Z
005  | ARRIVE      | 2014-07-31T11:49:45Z
005  | PICKUP      | 2014-07-31T11:49:46Z

我需要的是这样一份报告:

Item | Start Tx | End Tx | Time
001  | REQ-A    | PICKUP | 1:53:30
002  | REQ-B    | MAIL   | 0:24:13
003  | REQ-C    | PICKUP | 1:46:30
004  | REQ-D    | PULL   | 0:23:59
005  | REQ-A    | PICKUP | 1:43:59

What I have:

Item | Start Tx | End Tx   | Time
001  | REQ-A    | PULL     | 0:30:49
001  | REQ-A    | TRANSFER | 0:30:51
001  | REQ-A    | ARRIVE   | 1:53:29
001  | REQ-A    | PICKUP   | 1:53:30
002  | REQ-B    | PULL     | 0:24:11
002  | REQ-B    | MAIL     | 0:24:13
003  | REQ-C    | PULL     | 0:22:55
003  | REQ-C    | TRANSFER | 0:22:57
003  | REQ-C    | ARRIVE   | 1:46:29
003  | REQ-C    | PICKUP   | 1:46:30
004  | REQ-D    | PULL     | 0:23:59
005  | REQ-A    | PULL     | 0:22:55
005  | REQ-A    | TRANSFER | 0:22:57
005  | REQ-A    | ARRIVE   | 1:43:58
005  | REQ-A    | PICKUP   | 1:43:59

我正在做什么来获取数据:

SELECT Item, Transaction, nextTransaction, nextTimestamp - Timestamp
FROM (
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 5) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 5) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 4) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 4) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 3) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 3) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 2) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 2) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
    UNION ALL 
    SELECT Item, Transaction, Timestamp,
      LEAD(Transaction, 1) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTransaction"
      LEAD(Timestamp, 1) OVER (PARTITION BY Item ORDER BY Timestamp) AS "nextTimestamp"
    FROM Transactions
)
WHERE nextTransaction IS NOT NULL
AND Transaction IN ('REQ-A', 'REQ-B', 'REQ-C', 'REQ-D')

我可以在脚本中手动解析它(也许这实际上是最好的操作过程),但为了学习,我想知道是否有可能实际使用SQL单独完成此操作。

听起来您想要基于时间的第一个和最后一个事务。我认为以下是你想要的:

select item,
       min(transaction) keep (dense_rank first order by timestamp) as StartTx, 
       min(transaction) keep (dense_rank last order by timestamp) as EndTx,
       max(timestamp) - min(timestamp)
from transactions t
group by item;

您可以使用first_value解析函数代替lead:

select item, start_tran, end_tran, end_time - start_time
from (
  select item,
    first_value(transaction) over (partition by item
      order by timestamp) as start_tran,
    first_value(timestamp) over (partition by item
      order by timestamp) as start_time,
    first_value(transaction) over (partition by item
       order by timestamp desc) as end_tran,
    first_value(timestamp) over (partition by item
       order by timestamp desc) as end_time,
    row_number() over (partition by item
       order by timestamp) as rn
  from transactions
)
where rn = 1
order by item;
      ITEM START_TRAN END_TRAN   END_TIME-START_TIME
---------- ---------- ---------- -------------------
         1 REQ-A      PICKUP     0 1:53:30.0         
         2 REQ-B      MAIL       0 0:24:13.0         
         3 REQ-C      PICKUP     0 1:46:30.0         
         4 REQ-D      PULL       0 0:23:59.0         
         5 REQ-A      PICKUP     0 1:43:59.0         

row_number代替distinct。内部查询为原始表中的每一行生成一行,只有item和分析函数结果,因此它们对于每个item都是相同的;item 1有五行相同的行,每一行显示第一个和最后一个transaction以及相应的timestamp。外部查询主要折叠这些,但也执行时间戳减法以获得经过的时间间隔。

SQL Fiddle不能很好地显示间隔,但是如果你想以不同的方式表示,你可以从中提取小时/分钟/秒值。或者如果列是日期那么你可以直接使用to_char()

相关内容

  • 没有找到相关文章

最新更新