我正在尝试量化活动与空闲时间,而我需要做的第一件事就是创建独特的离散开始和结束时间。问题是数据库(我被告知这是一个错误(为事件创建多个"启动"时间。为了使其更加复杂,"报告"可以具有多个工作实例,并且每个报告都应以离散持续时间记录。
例如,
WorkflowID ReportID User Action Timestamp
1 1 A Start 1:00
2 1 A Stop 1:03
3 1 B Start 1:05
4 1 B Start 1:06
5 1 B Stop 1:08
6 1 B Start 1:10
7 1 B Start 1:11
8 1 B Stop 1:14
我想编写一个SQL查询,该查询将输出以下内容:
User StartTime EndTime
A 1:00 1:03
B 1:05 1:08
B 1:10 1:14
我遇到的问题是开始/停止事件的数量必须是任意的(每个用户的报告ID(。此外,需要删除该系列中第一个"启动"与以下"停止"之间的多余"启动"时间,以免弄乱它。
也许我错过了一些东西,但这对我来说很棘手。有什么想法吗?谢谢。
重复使用lag()
,将用户的先前操作与当前的动作进行比较。如果它们是相同的,那是重复的,请标记。然后,使用row_number()
编号开始和停止,以便每对开始和停止共享一个数字(每个报告和用户(。然后加入报告,用户和该号码。
为方便起见,您可以使用CTE来构建查询并防止重复某些子查询的必要性。
WITH
[DeduplicatedAndNumbered]
AS
(
SELECT [WorkflowID],
[ReportID],
[User],
[Action],
[Timestamp],
row_number() OVER (PARTITION BY [ReportID],
[User],
[Action]
ORDER BY [Timestamp]) [Number]
FROM (SELECT [WorkflowID],
[ReportID],
[User],
[Action],
[Timestamp],
CASE
WHEN lag([Action]) OVER (PARTITION BY [ReportId],
[User]
ORDER BY [Timestamp]) = [Action] THEN
1
ELSE
0
END [IsDuplicate]
FROM [elbaT]) [x]
WHERE [IsDuplicate] = 0
),
[DeduplicatedAndNumberedStart]
AS
(SELECT [WorkflowID],
[ReportID],
[User],
[Action],
[Timestamp],
[Number]
FROM [DeduplicatedAndNumbered]
WHERE [Action] = 'Start'),
[DeduplicatedAndNumberedStop]
AS
(SELECT [WorkflowID],
[ReportID],
[User],
[Action],
[Timestamp],
[Number]
FROM [DeduplicatedAndNumbered]
WHERE [Action] = 'Stop')
SELECT [DeduplicatedAndNumberedStart].[User],
[DeduplicatedAndNumberedStart].[Timestamp] [StartTime],
[DeduplicatedAndNumberedStop].[Timestamp] [EndTime]
FROM [DeduplicatedAndNumberedStart]
INNER JOIN [DeduplicatedAndNumberedStop]
ON [DeduplicatedAndNumberedStart].[ReportId] = [DeduplicatedAndNumberedStop].[ReportId]
AND [DeduplicatedAndNumberedStart].[User] = [DeduplicatedAndNumberedStop].[User]
AND [DeduplicatedAndNumberedStart].[Number] = [DeduplicatedAndNumberedStop].[Number];
db<>小提琴
op用 sql-server-2008
标记了他们的问题。
由于SQL Server 2008缺少lag()
函数(已在SQL Server 2012中添加(,因此这里是一种使用常见的表expressions和row_number((的解决方案,可从SQL Server 2005中获得...
;with [StopEvents] as (
select [WorkflowID],
[ReportID],
[User],
[EndTime] = [Timestamp],
[StopEventSeq] = row_number() over (
partition by [ReportID], [User], [Timestamp]
order by [Timestamp])
from Workflow
where [Action] = 'Stop'
)
select this.[User], [StartTime], this.[EndTime]
from [StopEvents] this
-- Left join here because first Stop event won't have a previous Stop event
left join [StopEvents] previous
on previous.[ReportID] = this.[ReportID]
and previous.[User] = this.[User]
and previous.[StopEventSeq] = this.[StopEventSeq] - 1
outer apply (
select [StartTime] = min([Timestamp])
from Workflow W
where W.[ReportID] = this.[ReportID]
and W.[User] = this.[User]
and W.[Timestamp] < this.[EndTime]
-- First Stop event won't have a previous, so just get the min([Timestamp])
and (previous.[EndTime] is null or W.[Timestamp] >= previous.[EndTime])
) thisStart
order by this.[User], this.[EndTime]