SQL: tracking consecutive days of absence, counting working days only



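I'm trying to create a table that records the dates on which employees were sick, with a new column providing a "sickness ID" that identifies each unique instance of absence spanning a number of dates. I've done that, but I now need to take into account a table containing each employee's working pattern, which tells me whether someone is due to work on a given day of the week.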

The two tables can be joined on the day_no and employee_number columns.

I posted this question earlier and @GMB provided a great solution, but I now need to factor in working hours.

I have a table called sickness that looks like this:

date_sick   day_no  day_name    employee_number hours_lost  working_hours   
2020-07-14  2       Tuesday     001             7.5         7.5             
2020-07-15  3       Wednesday   001             7.5         7.5             
2020-07-16  4       Thursday    001             7.5         7.5             
2020-07-17  5       Friday      001             7.5         7.5             
2020-07-21  2       Tuesday     001             7.5         7.5             
2020-07-22  3       Wednesday   001             7.5         7.5             
2020-07-23  4       Thursday    001             7.5         7.5             
2020-07-24  5       Friday      001             7.5         7.5             
2020-07-28  2       Tuesday     001             7.5         7.5             
2020-07-29  3       Wednesday   001             7.5         7.5             
2020-07-30  4       Thursday    001             7.5         7.5             
2020-07-31  5       Friday      001             7.5         7.5             
2020-09-09  3       Wednesday   001             7.5         7.5             
2020-09-10  4       Thursday    001             7.5         7.5             
2020-07-22  3       Wednesday   002             8           8               
2020-07-23  4       Thursday    002             8           8              

My working_hours table looks like this:

employee_number day_no working_hours
001             1      0
001             2      7.5
001             3      7.5
001             4      7.5
001             5      7.5
001             6      0
001             7      0
002             1      8
002             2      8
002             3      8
002             4      8
002             5      8
002             6      0
002             7      0

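Using the statement below, I can assign a sickness ID that identifies each instance of an employee's absence over consecutive dates. The ID is unique to the combination of employee and run of absence dates: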

IF OBJECT_ID('dbo.sickness ', 'u') IS NOT NULL DROP TABLE dbo.sickness 
CREATE TABLE dbo.sickness (date_sick date
, day_no int
, day_name varchar(10)
, employee_number char(5)
, hours_lost float
, working_hours float)
INSERT INTO dbo.sickness (date_sick, day_no, day_name, Employee_Number, Hours_Lost, Working_Hours)
VALUES 
('2020-07-14', '2', 'Tuesday', '001', '7.5', '7.5'),
('2020-07-15', '3', 'Wednesday', '001', '7.5', '7.5'),
('2020-07-16', '4', 'Thursday', '001', '7.5', '7.5'),
('2020-07-17', '5', 'Friday', '001', '7.5', '7.5'),
('2020-07-21', '2', 'Tuesday', '001', '7.5', '7.5'),
('2020-07-22', '3', 'Wednesday', '001', '7.5', '7.5'),
('2020-07-23', '4', 'Thursday', '001', '7.5', '7.5'),
('2020-07-24', '5', 'Friday', '001', '7.5', '7.5'),
('2020-07-28', '2', 'Tuesday', '001', '7.5', '7.5'),
('2020-07-29', '3', 'Wednesday', '001', '7.5', '7.5'),
('2020-07-30', '4', 'Thursday', '001', '7.5', '7.5'),
('2020-07-31', '5', 'Friday', '001', '7.5', '7.5'),
('2020-09-09', '3', 'Wednesday', '001', '7.5', '7.5'),
('2020-09-10', '4', 'Thursday', '001', '7.5', '7.5'),
('2020-07-22', '3', 'Wednesday', '002', '8', '8'),
('2020-07-23', '4', 'Thursday', '002', '8', '8')
GO
IF OBJECT_ID('dbo.working_hours ', 'u') IS NOT NULL DROP TABLE dbo.working_hours 
CREATE TABLE dbo.working_hours (employee_number char(5)
, day_no int
, working_hours float)
INSERT INTO dbo.working_hours (employee_number, day_no, working_hours)
VALUES 
('001', '1', '0'),
('001', '2', '7.5'),
('001', '3', '7.5'),
('001', '4', '7.5'),
('001', '5', '7.5'),
('001', '6', '0'),
('001', '7', '0'),
('002', '1', '8'),
('002', '2', '8'),
('002', '3', '8'),
('002', '4', '8'),
('002', '5', '8'),
('002', '6', '0'),
('002', '7', '0');

WITH CTE AS(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY employee_number ORDER BY date_sick) AS rn
FROM dbo.sickness s)
SELECT c.date_sick,
c.day_no,
c.day_name,
c.employee_number,
c.hours_lost,
w.working_hours,
DENSE_RANK() OVER (ORDER BY C.employee_number, DATEADD(DAY, -C.rn, C.date_sick)) AS sickness_id
FROM CTE C
JOIN working_hours w
ON  c.employee_number = w.employee_number
AND c.day_no = w.day_no
ORDER BY C.employee_number,
C.date_sick
DROP TABLE dbo.sickness
DROP TABLE dbo.working_hours

This outputs the following table:

date_sick   day_no  day_name    employee_number hours_lost  working_hours   sickness_id
2020-07-14  2       Tuesday     001             7.5         7.5             1
2020-07-15  3       Wednesday   001             7.5         7.5             1
2020-07-16  4       Thursday    001             7.5         7.5             1
2020-07-17  5       Friday      001             7.5         7.5             1
2020-07-21  2       Tuesday     001             7.5         7.5             2
2020-07-22  3       Wednesday   001             7.5         7.5             2
2020-07-23  4       Thursday    001             7.5         7.5             2
2020-07-24  5       Friday      001             7.5         7.5             2
2020-07-28  2       Tuesday     001             7.5         7.5             3
2020-07-29  3       Wednesday   001             7.5         7.5             3
2020-07-30  4       Thursday    001             7.5         7.5             3
2020-07-31  5       Friday      001             7.5         7.5             3
2020-09-09  3       Wednesday   001             7.5         7.5             4
2020-09-10  4       Thursday    001             7.5         7.5             4
2020-07-22  3       Wednesday   002             8           8               5
2020-07-23  4       Thursday    002             8           8               5

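The problem is that it groups consecutive days, but only days within the same week. The first 12 rows should all have the same sickness ID, because employee 001 is not due to work on the days in between. What I want is the following table: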

date_sick   day_no  day_name    employee_number hours_lost  working_hours   sickness_id
2020-07-14  2       Tuesday     001             7.5         7.5             1
2020-07-15  3       Wednesday   001             7.5         7.5             1
2020-07-16  4       Thursday    001             7.5         7.5             1
2020-07-17  5       Friday      001             7.5         7.5             1
2020-07-21  2       Tuesday     001             7.5         7.5             1
2020-07-22  3       Wednesday   001             7.5         7.5             1
2020-07-23  4       Thursday    001             7.5         7.5             1
2020-07-24  5       Friday      001             7.5         7.5             1
2020-07-28  2       Tuesday     001             7.5         7.5             1
2020-07-29  3       Wednesday   001             7.5         7.5             1
2020-07-30  4       Thursday    001             7.5         7.5             1
2020-07-31  5       Friday      001             7.5         7.5             1
2020-09-09  3       Wednesday   001             7.5         7.5             2
2020-09-10  4       Thursday    001             7.5         7.5             2
2020-07-22  3       Wednesday   002             8           8               3
2020-07-23  4       Thursday    002             8           8               3

Any ideas? Maybe join it to a calendar table?

As I mentioned in the comments, just use a WHERE. Of course, given the lack of sample data (the sample has no working-hours data), this is a blind guess:

-- I prefer CTEs over subqueries
WITH CTE AS(
    SELECT s.date_sick,
           s.day_no,
           s.employee_number,
           ROW_NUMBER() OVER (PARTITION BY s.employee_number ORDER BY s.date_sick) AS rn
    FROM dbo.sickness s)
SELECT C.date_sick,
       C.employee_number,
       DENSE_RANK() OVER (ORDER BY C.employee_number, DATEADD(DAY, -C.rn, C.date_sick)) AS sickness_id,
       wh.working_hours
FROM CTE C
     -- Match on employee and weekday; otherwise each sick day joins to all 7 pattern rows
     JOIN dbo.working_hours wh ON C.employee_number = wh.employee_number
                              AND C.day_no = wh.day_no
WHERE wh.working_hours > 0 -- keep only days the employee was due to work
ORDER BY C.employee_number,
         C.date_sick;

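I think using lag() to check whether the sick days are consecutive, and then taking a cumulative sum, is a better way to assign the sickness ID.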

I'm a little unclear on exactly what you want, but here is one approach:

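select date_sick, employee_number,
       -- running count of "episode starts": a row starts a new episode unless it
       -- directly follows the previous sick date and both are working days
       sum(case when working_hours > 0 and prev_working_hours > 0 and
                     dateadd(day, -1, date_sick) = prev_date_sick
                then 0 else 1
           end) over (partition by employee_number order by date_sick) as sickness_id
from (select s.date_sick, s.employee_number, wh.working_hours,
             lag(s.date_sick) over (partition by s.employee_number order by s.date_sick) as prev_date_sick,
             lag(wh.working_hours) over (partition by s.employee_number order by s.date_sick) as prev_working_hours
      from dbo.sickness s left join
           dbo.working_hours wh
           -- join on the working pattern for that employee and weekday
           on s.employee_number = wh.employee_number and
              s.day_no = wh.day_no
     ) s
order by employee_number, date_sick;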
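
Following up on the calendar-table idea from the question, here is a minimal sketch of one way to bridge non-working days. It is not taken from either answer above. It numbers each employee's scheduled working days and then applies the usual "sequence minus row number" island trick to that numbering instead of to the calendar date. The assumptions are mine: day_no runs from 1 = Monday to 7 = Sunday (inferred from 2 = Tuesday in the sample), the working pattern does not change over time, and the CTE names bounds, calendar, working_days and sick are made up for illustration. It needs dbo.sickness and dbo.working_hours to exist, so run it before the DROP TABLE statements.

```sql
WITH bounds AS (
    -- Date range covered by the sickness data
    SELECT MIN(date_sick) AS d_min, MAX(date_sick) AS d_max
    FROM dbo.sickness
),
calendar AS (
    -- Recursive CTE producing one row per calendar day in that range
    SELECT d_min AS d, d_max FROM bounds
    UNION ALL
    SELECT DATEADD(DAY, 1, d), d_max FROM calendar WHERE d < d_max
),
working_days AS (
    -- Days each employee is scheduled to work, numbered consecutively.
    -- 1900-01-01 was a Monday, so the expression yields 1 = Monday .. 7 = Sunday
    -- regardless of the session's DATEFIRST setting (assumed day_no convention).
    SELECT w.employee_number,
           c.d,
           w.working_hours,
           ROW_NUMBER() OVER (PARTITION BY w.employee_number ORDER BY c.d) AS wd_seq
    FROM calendar c
    JOIN dbo.working_hours w
      ON w.day_no = DATEDIFF(DAY, '19000101', c.d) % 7 + 1
    WHERE w.working_hours > 0
),
sick AS (
    -- wd_seq minus the sick-day row number stays constant while the employee
    -- is absent on every scheduled working day in a row
    SELECT s.date_sick, s.day_no, s.day_name, s.employee_number, s.hours_lost,
           wd.working_hours,
           wd.wd_seq
             - ROW_NUMBER() OVER (PARTITION BY s.employee_number ORDER BY s.date_sick) AS grp
    FROM dbo.sickness s
    JOIN working_days wd
      ON wd.employee_number = s.employee_number
     AND wd.d = s.date_sick
)
SELECT date_sick, day_no, day_name, employee_number, hours_lost, working_hours,
       DENSE_RANK() OVER (ORDER BY employee_number, grp) AS sickness_id
FROM sick
ORDER BY employee_number, date_sick
OPTION (MAXRECURSION 0); -- lift the default limit of 100 recursion levels for longer date ranges
```

Note that the inner join drops any sickness row recorded on a day the employee is not scheduled to work; whether that is desirable depends on your data. With a permanent calendar table, the bounds and calendar CTEs could simply be replaced by that table.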
