Can anyone help? Our attendance system generates the following data:
User Department Date Time Reader
A1 IT 1/3/2014 11:12:00 1
B1 IT 1/3/2014 12:28:06 1
B1 IT 1/3/2014 12:28:07 1
A1 IT 1/3/2014 13:12:00 2
B1 IT 1/3/2014 13:28:06 2
A1 IT 2/3/2014 07:42:15 1
A1 IT 2/3/2014 16:16:15 2
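where the Reader value means:
- 1 = Entry
- 2 = Exit
I'm looking for a SQL query that runs on MS SQL 2005 and totals each employee's attendance time per month, for example: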
User Department Month Time
A1 IT 3/2014 10:34
B1 IT 3/2014 01:00
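This is a fairly hard problem to solve in SQL, because it requires finding transitions and ranges in the data, which is not trivial. I broke the problem into a series of steps made of consecutive CTEs, each building on the previous one and leading to the final working solution.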
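To follow along, here is a minimal setup sketch for the sample data from the question. The table name `att` and the column names come from the queries in this answer; the column types are an assumption, since the original schema is not shown. Both `[date]` and `[time]` are declared as `datetime` because SQL Server 2005 has no separate `date` or `time` types, and the source dates are read as day/month/year (so 1/3/2014 is 1 March 2014).

```sql
-- Hypothetical schema: the original post does not show the table definition.
create table att (
    [user]       varchar(10) not null,
    [department] varchar(10) not null,
    [date]       datetime    not null,  -- date part only (midnight)
    [time]       datetime    not null,  -- time part only (stored on the 1900-01-01 base date)
    [reader]     int         not null   -- 1 = entry, 2 = exit
);

-- Sample rows from the question; 'yyyymmdd' literals avoid any day/month ambiguity.
insert into att ([user], [department], [date], [time], [reader])
select 'A1', 'IT', '20140301', '11:12:00', 1 union all
select 'B1', 'IT', '20140301', '12:28:06', 1 union all
select 'B1', 'IT', '20140301', '12:28:07', 1 union all
select 'A1', 'IT', '20140301', '13:12:00', 2 union all
select 'B1', 'IT', '20140301', '13:28:06', 2 union all
select 'A1', 'IT', '20140302', '07:42:15', 1 union all
select 'A1', 'IT', '20140302', '16:16:15', 2;
```

With this layout, `[date] + [time]` yields the combined timestamp, because a time-only `datetime` value sits on the zero date and adds nothing to the day part.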
First, I add a per-user row index to the data, which gives a simple key for identifying each row within a user:
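with NumberedAtt as (
    select
        -- ix restarts at 1 for each user and follows chronological order
        row_number() over (partition by [user] order by date, time, reader) as ix,
        att.[user],
        att.[department],
        att.[date] + att.[time] as dt,  -- combine the date and time columns into one datetime
        att.[reader]
    from att
)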
Then I get the first and last index values for each user, which serve as the outermost boundaries of that user's entry/exit ranges:
, MinMax as (
    select [user], min(ix) ixMin, max(ix) ixMax
    from NumberedAtt N
    group by [user]
)
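Next, I put these together to generate a list of all exit/entry transitions, which are the points where the Reader value changes from 2 to 1. These are the specific points that pinpoint when the previous time range ended and the next one began (and they cleanly handle consecutive duplicate entry or exit reads). Combining them with each user's first entry and last exit gives the list of all entry/exit transitions: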
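, Transitions as (
    -- Each user's first read opens the first range (no preceding exit, so exitIx = 0)
    select N.[User], 0 as exitIx, M.ixMin as entryIx
    from NumberedAtt N
    join MinMax M on N.[User] = M.[User]
    where N.ix = M.ixMin
    union
    -- Each user's last read closes the last range (no following entry, so entryIx = 0)
    select N.[User], M.ixMax as exitIx, 0 as entryIx
    from NumberedAtt N
    join MinMax M on N.[User] = M.[User]
    where N.ix = M.ixMax
    union
    -- Interior transitions: an exit read immediately followed by an entry read
    select A1.[User], A1.ix as exitIx, A2.ix as entryIx
    from NumberedAtt A1
    join NumberedAtt A2 on A1.ix + 1 = A2.ix and A1.[user] = A2.[user]
    where A1.[reader] = 2 and A2.[reader] = 1
)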
Here is the output at this point:
| USER | EXITIX | ENTRYIX |
|------|--------|---------|
| A1 | 0 | 1 |
| A1 | 2 | 3 |
| A1 | 4 | 0 |
| B1 | 0 | 1 |
| B1 | 3 | 0 |
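Notice that we have neatly captured all the row indexes where time ranges start and end. However, they are offset: the entry index in one row pairs with the exit index in the next row. So we need one more transformation that adds a row index to this table and joins each row to the one after it, putting the ranges back together: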
, NumberedTransitions as (
    select
        row_number() over (partition by [User] order by exitIx) tix,
        T.*
    from Transitions T
), EntryExit as (
    select
        aEntry.ix as ixEntry,
        aExit.ix as ixExit,
        aEntry.[user],
        aEntry.[department],
        aEntry.[dt] as entryDT,
        aExit.[dt] as exitDT
    from NumberedTransitions tEntry
    -- Look up the detail row for this transition's entry index
    join NumberedAtt aEntry on tEntry.entryIx = aEntry.ix and tEntry.[user] = aEntry.[user]
    -- The next transition row for the same user holds the matching exit index
    join NumberedTransitions tExit on tEntry.tix + 1 = tExit.tix and tEntry.[user] = tExit.[user]
    join NumberedAtt aExit on tExit.exitIx = aExit.ix and tExit.[user] = aExit.[user]
)
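While joining the consecutive ranges together, I also pull the original detail data back in, since up to this point I have only been working with row index values. At the end of this subquery we have identified every entry/exit range for each user and "swallowed" any repeated reads: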
| IXENTRY | IXEXIT | USER | DEPARTMENT | ENTRYDT | EXITDT |
|---------|--------|------|------------|------------------------------|------------------------------|
| 1 | 2 | A1 | IT | March, 01 2014 11:12:00+0000 | March, 01 2014 13:12:00+0000 |
| 3 | 4 | A1 | IT | March, 02 2014 07:42:15+0000 | March, 02 2014 16:16:15+0000 |
| 1 | 3 | B1 | IT | March, 01 2014 12:28:06+0000 | March, 01 2014 13:28:06+0000 |
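The pairing step can be checked in isolation. The sketch below feeds the Transitions rows shown earlier in as literals (sample data only, not read from `att`) and applies the same `tix + 1` self-join, leaving out the lookups against `NumberedAtt`.

```sql
-- Literal copy of the Transitions output shown earlier (sample data only)
with Transitions as (
    select 'A1' as [User], 0 as exitIx, 1 as entryIx union all
    select 'A1', 2, 3 union all
    select 'A1', 4, 0 union all
    select 'B1', 0, 1 union all
    select 'B1', 3, 0
), NumberedTransitions as (
    select
        row_number() over (partition by [User] order by exitIx) as tix,
        T.*
    from Transitions T
)
-- Row n supplies the entry index; row n + 1 supplies the matching exit index
select tEntry.[User], tEntry.entryIx, tExit.exitIx
from NumberedTransitions tEntry
join NumberedTransitions tExit
    on tEntry.tix + 1 = tExit.tix and tEntry.[User] = tExit.[User]
order by tEntry.[User], tEntry.entryIx;
```

This should return the index pairs (1, 2) and (3, 4) for A1 and (1, 3) for B1, matching the IXENTRY and IXEXIT columns above. The last transition row for each user has no following row, so the inner join drops it as an entry, which is the intended behaviour.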
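The only thing left is to group the data to report the total hours per user per month. Calculating the total is a little tricky, but it can be done by summing the minutes across the ranges and then converting the result back into an hours:minutes string: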
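, Hours as (
    select
        [User],
        [Department],
        Year(EntryDT) Year,
        Month(EntryDT) Month,
        -- Hours: pad to two digits only below 10, so monthly totals of 100+ hours are not truncated
        CASE WHEN SUM(DATEDIFF(Minute, EntryDT, ExitDT)) / 60 < 10 THEN '0' ELSE '' END +
        CAST(SUM(DATEDIFF(Minute, EntryDT, ExitDT)) / 60 as varchar(10)) + ':' +
        -- Minutes: always two digits
        RIGHT('0' + CAST(SUM(DATEDIFF(Minute, EntryDT, ExitDT)) % 60 as varchar(2)), 2) as TotalHours
    from EntryExit EE
    group by [User], [Department], Year(EntryDT), Month(EntryDT)
)
select * from Hours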
This gives a final result that is very close to what was asked for:
| USER | DEPARTMENT | YEAR | MONTH | TOTALHOURS |
|------|------------|------|-------|------------|
| A1 | IT | 2014 | 3 | 10:34 |
| B1 | IT | 2014 | 3 | 01:00 |
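The formatting can be tweaked as needed, but that should be easy to build on top of this framework.
Here is a working demo of the full query: http://www.sqlfiddle.com/#!3/f3f37/7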
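As one example of such a formatting tweak, here is a small standalone sketch of the minutes-to-hours conversion. The variable `@minutes` is hypothetical and stands in for the `SUM(DATEDIFF(Minute, EntryDT, ExitDT))` aggregate; 634 is A1's total for the sample data (120 minutes on 1 March plus 514 minutes on 2 March).

```sql
-- @minutes is a stand-in for SUM(DATEDIFF(Minute, EntryDT, ExitDT))
declare @minutes int;
set @minutes = 634;  -- 120 + 514 minutes for user A1 in the sample data

select
    -- Hours: add a leading zero only below 10, keeping all digits for larger totals
    case when @minutes / 60 < 10 then '0' else '' end
    + cast(@minutes / 60 as varchar(10))
    + ':'
    -- Minutes: always two digits
    + right('0' + cast(@minutes % 60 as varchar(2)), 2) as TotalHours;
-- returns '10:34'
```

Padding the hours with a plain `RIGHT('0' + ..., 2)` would keep only the last two digits, so a month with 100 or more hours would be cut short; the `case` expression avoids that. Also note that `DATEDIFF(Minute, ...)` counts minute boundaries crossed rather than elapsed seconds, so if second-level accuracy matters, summing seconds and dividing afterwards may be a better fit.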