当给定多个开始时间时,如何创建开始和停止时间?



我正在尝试量化活动与空闲时间,而我需要做的第一件事就是创建独特的离散开始和结束时间。问题是数据库(我被告知这是一个错误(为事件创建多个"启动"时间。为了使其更加复杂,"报告"可以具有多个工作实例,并且每个报告都应以离散持续时间记录。

例如,

WorkflowID ReportID User Action Timestamp      
1           1         A  Start   1:00      
2           1         A  Stop    1:03      
3           1         B  Start   1:05      
4           1         B  Start   1:06      
5           1         B  Stop    1:08      
6           1         B  Start   1:10      
7           1         B  Start   1:11      
8           1         B  Stop    1:14

我想编写一个SQL查询,该查询将输出以下内容:

User StartTime EndTime
A     1:00      1:03
B     1:05      1:08
B     1:10      1:14

我遇到的问题是开始/停止事件的数量必须是任意的(每个用户的报告ID(。此外,需要删除该系列中第一个"启动"与以下"停止"之间的多余"启动"时间,以免弄乱它。

也许我错过了一些东西,但这对我来说很棘手。有什么想法吗?谢谢。

重复使用lag(),将用户的先前操作与当前的动作进行比较。如果它们是相同的,那是重复的,请标记。然后,使用row_number()编号开始和停止,以便每对开始和停止共享一个数字(每个报告和用户(。然后加入报告,用户和该号码。

为方便起见,您可以使用CTE来构建查询并防止重复某些子查询的必要性。

WITH
[DeduplicatedAndNumbered]
AS
(
SELECT [WorkflowID],
       [ReportID],
       [User],
       [Action],
       [Timestamp],
       row_number() OVER (PARTITION BY [ReportID],
                                       [User],
                                       [Action]
                          ORDER BY [Timestamp]) [Number]
       FROM (SELECT [WorkflowID],
                    [ReportID],
                    [User],
                    [Action],
                    [Timestamp],
                    CASE
                      WHEN lag([Action]) OVER (PARTITION BY [ReportId],
                                                            [User]
                                               ORDER BY [Timestamp]) = [Action] THEN
                        1
                      ELSE
                        0
                   END [IsDuplicate]
                   FROM [elbaT]) [x]
       WHERE [IsDuplicate] = 0
),
[DeduplicatedAndNumberedStart]
AS
(SELECT [WorkflowID],
        [ReportID],
        [User],
        [Action],
        [Timestamp],
        [Number]
        FROM [DeduplicatedAndNumbered]
        WHERE [Action] = 'Start'),
[DeduplicatedAndNumberedStop]
AS
(SELECT [WorkflowID],
        [ReportID],
        [User],
        [Action],
        [Timestamp],
        [Number]
        FROM [DeduplicatedAndNumbered]
        WHERE [Action] = 'Stop')
SELECT [DeduplicatedAndNumberedStart].[User],
       [DeduplicatedAndNumberedStart].[Timestamp] [StartTime],
       [DeduplicatedAndNumberedStop].[Timestamp] [EndTime]
       FROM [DeduplicatedAndNumberedStart]
            INNER JOIN [DeduplicatedAndNumberedStop]
                       ON [DeduplicatedAndNumberedStart].[ReportId] = [DeduplicatedAndNumberedStop].[ReportId]
                          AND [DeduplicatedAndNumberedStart].[User] = [DeduplicatedAndNumberedStop].[User]
                          AND [DeduplicatedAndNumberedStart].[Number] = [DeduplicatedAndNumberedStop].[Number];

db<>小提琴

op用 sql-server-2008标记了他们的问题。

由于SQL Server 2008缺少lag()函数(已在SQL Server 2012中添加(,因此这里是一种使用常见的表expressions和row_number((的解决方案,可从SQL Server 2005中获得...

;with [StopEvents] as (
    select  [WorkflowID],
        [ReportID],
        [User],
        [EndTime] = [Timestamp],
        [StopEventSeq] = row_number() over (
            partition by [ReportID], [User], [Timestamp]
            order by [Timestamp])
    from Workflow
    where [Action] = 'Stop'
)
select  this.[User], [StartTime], this.[EndTime]
from [StopEvents] this
-- Left join here because first Stop event won't have a previous Stop event
left join [StopEvents] previous
    on previous.[ReportID] = this.[ReportID]
    and previous.[User] = this.[User]
    and previous.[StopEventSeq] = this.[StopEventSeq] - 1
outer apply (
    select  [StartTime] = min([Timestamp])
    from    Workflow W
    where   W.[ReportID] = this.[ReportID]
    and W.[User] = this.[User]
    and W.[Timestamp] < this.[EndTime]
    -- First Stop event won't have a previous, so just get the min([Timestamp])
    and (previous.[EndTime] is null or W.[Timestamp] >= previous.[EndTime])
) thisStart
order by this.[User], this.[EndTime]

最新更新