为日本一家公司解决问题。政府有一些规定,比如…如果你持工作签证:
- 如果不休息30天,您在公司的工作时间不能超过3年
- 你不能在同一家人事公司工作超过5年而不休假6个月
因此,我们想弄清楚在未来30/60/90天内是否有人会违反这两条规则。
样本数据(合同清单(:
if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyID int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27809, 972, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27853, 484, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27856, 172, '2019-10-10', '2020-10-10')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2015-01-01', '2015-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2016-01-01', '2017-02-28')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2017-01-01', '2017-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2018-01-01', '2018-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2019-01-01', '2020-01-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27857, 1234, '2020-01-01', '2020-12-31')
insert #sampleDates (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values (27897, 179, '2019-10-10', '2020-10-10')
我的第一个问题可能是日期重叠。我已经接近这个问题的解决方案了,但在我知道如何解决X年/Y天工作假的问题之前,我不确定我的cte或临时表的输出应该是什么样子。
我不希望有人为我做这项工作,但我想找到一篇可以告诉我的文章:
- 如何确定某人在这段时间内是否休息过,休息了多长时间(日期范围之间的间隔(
- 我如何计算他们在接下来的30/60/90天里是否已经工作了3/5年而没有休息30/180天
在我开始对程序进行编码之前,这似乎很简单。
感谢您提前提供的帮助。
编辑:
值得一提的是,这是我第二次尝试消除重叠日期(第一个版本使用了dense_rank方法,它一直有效,直到我搞砸了一些事情,用了一些简单的方法(:
;with CJ as (
select UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd from #sampleDates c
)
select
c.CompanyID,
c.WorkPeriodStart,
min(t1.WorkPeriodEnd) as EndDate
from CJ c
inner join CJ t1 on c.WorkPeriodStart <= t1.WorkPeriodEnd and c.UserId = t1.UserId and c.CompanyID = t1.CompanyID
and not exists(select * from CJ t2 where t1.UserId = t2.UserId and t1.CompanyID = t2.CompanyID and t1.WorkPeriodEnd >= t2.WorkPeriodStart AND t1.WorkPeriodEnd < t2.WorkPeriodEnd)
where not exists(select * from CJ c2 where c.UserId = c2.UserId and c.CompanyID = c2.CompanyID and c.WorkPeriodStart > c2.WorkPeriodStart AND c.WorkPeriodStart <= c2.WorkPeriodEnd)
group by c.UserId, c.CompanyID, c.WorkPeriodStart
order by c.UserId, c.WorkPeriodStart
免责声明:这是一个不完整的答案
我稍后可以继续,但这显示了如何计算岛屿。那么识别罪犯就不应该那么复杂了。
请参见增强示例。我添加了用户27897
,它有三个岛:0、1和2。见下文:
create table t (UserId int, CompanyID int, WorkPeriodStart date, WorkPeriodEnd date);
insert t (UserId, CompanyID, WorkPeriodStart, WorkPeriodEnd) values
(27809, 972, '2019-10-10', '2020-10-10'),
(27853, 484, '2019-10-10', '2020-10-10'),
(27856, 172, '2019-10-10', '2020-10-10'),
(27857, 1234, '2015-01-01', '2015-12-31'),
(27857, 1234, '2016-01-01', '2017-02-28'),
(27857, 1234, '2017-01-01', '2017-12-31'),
(27857, 1234, '2018-01-01', '2018-12-31'),
(27857, 1234, '2019-01-01', '2020-01-31'),
(27857, 1234, '2020-01-01', '2020-12-31'),
(27897, 179, '2015-05-28', '2015-09-30'),
(27897, 179, '2017-03-11', '2017-04-30'),
(27897, 188, '2017-02-20', '2017-07-07'),
(27897, 179, '2019-10-10', '2020-10-10');
有了这些数据,计算每行孤岛的查询可以看起来像:
select *,
sum(hop) over(partition by UserId order by WorkPeriodStart) as island
from (
select *,
case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd)
over(partition by UserId
order by WorkPeriodStart
rows between unbounded preceding and 1 preceding))
then 1 else 0 end as hop
from t
) x
order by UserId, WorkPeriodStart
结果:
UserId CompanyID WorkPeriodStart WorkPeriodEnd hop island
------ --------- --------------- ------------- --- ------
27809 972 2019-10-10 2020-10-10 0 0
27853 484 2019-10-10 2020-10-10 0 0
27856 172 2019-10-10 2020-10-10 0 0
27857 1234 2015-01-01 2015-12-31 0 0
27857 1234 2016-01-01 2017-02-28 0 0
27857 1234 2017-01-01 2017-12-31 0 0
27857 1234 2018-01-01 2018-12-31 0 0
27857 1234 2019-01-01 2020-01-31 0 0
27857 1234 2020-01-01 2020-12-31 0 0
27897 179 2015-05-28 2015-09-30 0 0
27897 188 2017-02-20 2017-07-07 1 1
27897 179 2017-03-11 2017-04-30 0 1
27897 179 2019-10-10 2020-10-10 1 2
现在,我们可以扩充这个查询以获得";工作日";对于每个岛;休息日";在每个岛屿之前,通过做:
select *,
datediff(day, s, e) + 1 as worked,
datediff(day, lag(e) over(partition by UserId order by island), s) as prev_days_off
from (
select UserId, island, min(WorkPeriodStart) as s, max(WorkPeriodEnd) as e
from (
select *,
sum(hop) over(partition by UserId order by WorkPeriodStart) as island
from (
select *,
case when WorkPeriodStart > dateadd(day, 1, max(WorkPeriodEnd)
over(partition by UserId
order by WorkPeriodStart
rows between unbounded preceding and 1 preceding))
then 1 else 0 end as hop
from t
) x
) y
group by UserId, island
) x
order by UserId, island
结果:
UserId island s e worked prev_days_off
------ ------ ---------- ---------- ------ -------------
27809 0 2019-10-10 2020-10-10 367 <null>
27853 0 2019-10-10 2020-10-10 367 <null>
27856 0 2019-10-10 2020-10-10 367 <null>
27857 0 2015-01-01 2020-12-31 2192 <null>
27897 0 2015-05-28 2015-09-30 126 <null>
27897 1 2017-02-20 2017-07-07 138 509
27897 2 2019-10-10 2020-10-10 367 825
这个结果非常接近你所需要的。这些数据实际上对于根据您的条件筛选行非常有用。
此脚本合并任何重叠的工作周期,然后计算前3年和5年内的总工作天数。然后取此值,并确定这是否超过UserId
和CompanyId
在3年期限内允许的最大工作日,以及仅超过UserId
在5年期限内所允许的最大上班日。(这是对你问题中规则的正确解释吗?(
然后,它只需将30
、60
和90
天添加到总数中,看看这个较大的值是否会超过相应的限制。考虑到不同的分组规则,这将是更干净的2个查询(对于5年规则,没有UserId
的重复(,但结果仍然是针对任何违规UserId
的标志。
在下面的例子中,你可以看到UserId = 27857
目前只违反了5年规则,但如果他们再呆60
天,也会违反3年规则。此外,UserId = 27858
目前还可以,但在60
天内将违反5年规则。
我已经对如何定义一年以及WorkPeriodEnd
值是否包含做出了一些假设,所以请检查所需的逻辑是否正确应用。
脚本
if object_id('tempdb..#sampleDates') is not null drop table #sampleDates
create table #sampleDates (UserId int, CompanyId int, WorkPeriodStart datetime, WorkPeriodEnd datetime)
insert #sampleDates values
(27809, 972, '2019-10-10', '2020-10-10')
,(27853, 484, '2019-10-10', '2020-10-10')
,(27856, 172, '2019-10-10', '2020-10-10')
,(27857, 1234, '2015-01-01', '2015-12-31')
,(27857, 1234, '2016-01-01', '2017-02-28')
,(27857, 1234, '2017-01-01', '2017-12-31')
,(27857, 1234, '2018-01-01', '2018-12-31')
,(27857, 1234, '2019-01-01', '2020-01-31')
,(27857, 1234, '2020-01-01', '2020-05-31')
,(27858, 1234, '2015-01-01', '2015-12-31')
,(27858, 1234, '2016-01-01', '2017-02-28')
,(27858, 1234, '2017-01-01', '2017-12-31')
,(27858, 1234, '2018-01-01', '2018-12-31')
,(27858, 1234, '2019-09-01', '2020-01-31')
,(27858, 1234, '2020-01-01', '2020-08-31')
,(27859, 12345, '2015-01-01', '2015-12-31')
,(27859, 12346, '2016-01-01', '2017-02-28')
,(27859, 12347, '2017-01-01', '2017-12-31')
,(27859, 12348, '2018-01-01', '2018-12-31')
,(27859, 12349, '2019-01-01', '2020-01-31')
,(27859, 12340, '2020-01-01', '2020-12-31')
,(27897, 179, '2019-10-10', '2020-10-10')
;
declare @3YearsAgo date = dateadd(year,-3,getdate());
declare @3YearWorkingDays int = (365*3)-30;
declare @5YearsAgo date = dateadd(year,-5,getdate());
declare @5YearWorkingDays int = (365*5)-(365/2);
with p as
(
select UserId
,CompanyId
,min(WorkPeriodStart) as WorkPeriodStart
,max(WorkPeriodEnd) as WorkPeriodEnd
from(select l.*,
sum(case when dateadd(day,1,l.PrevEnd) < l.WorkPeriodStart then 1 else 0 end) over (partition by l.UserId, l.CompanyId order by l.WorkPeriodStart rows unbounded preceding) as grp
from(select d.*,
lag(d.WorkPeriodEnd) over (partition by d.UserId, d.CompanyId order by d.WorkPeriodEnd) as PrevEnd
from #sampleDates as d
) as l
) as g
group by grp
,UserId
,CompanyId
)
,d as
(
select UserId
,CompanyId
,sum(case when @3YearsAgo < WorkPeriodEnd
then datediff(day
,case when @3YearsAgo between WorkPeriodStart and WorkPeriodEnd then @3YearsAgo else WorkPeriodStart end
,WorkPeriodEnd
)
else 0
end
) as WorkingDays3YearsToToday
,sum(case when @5YearsAgo < WorkPeriodEnd
then datediff(day
,case when @5YearsAgo between WorkPeriodStart and WorkPeriodEnd then @5YearsAgo else WorkPeriodStart end
,WorkPeriodEnd
)
else 0
end
) as WorkingDays5YearsToToday
from p
group by UserId
,CompanyId
)
select UserId
,CompanyId
,@3YearWorkingDays as Limit3Year
,@5YearWorkingDays as Limit5Year
,WorkingDays3YearsToToday
,WorkingDays5YearsToToday
,case when WorkingDays3YearsToToday > @3YearWorkingDays then 1 else 0 end as Violation3YearNow
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) > @5YearWorkingDays then 1 else 0 end as Violation5YearNow
,case when WorkingDays3YearsToToday + 30 > @3YearWorkingDays then 1 else 0 end as Violation3Year30Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 30 > @5YearWorkingDays then 1 else 0 end as Violation5Year30Day
,case when WorkingDays3YearsToToday + 60 > @3YearWorkingDays then 1 else 0 end as Violation3Year60Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 60 > @5YearWorkingDays then 1 else 0 end as Violation5Year60Day
,case when WorkingDays3YearsToToday + 90 > @3YearWorkingDays then 1 else 0 end as Violation3Year90Day
,case when sum(WorkingDays5YearsToToday) over (partition by UserId) + 90 > @5YearWorkingDays then 1 else 0 end as Violation5Year90Day
from d
order by UserId
,CompanyId;
输出
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
| UserId | CompanyId | Limit3Year | Limit5Year | WorkingDays3YearsToToday | WorkingDays5YearsToToday | Violation3YearNow | Violation5YearNow | Violation3Year30Day | Violation5Year30Day | Violation3Year60Day | Violation5Year60Day | Violation3Year90Day | Violation5Year90Day |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
| 27809 | 972 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27853 | 484 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27856 | 172 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27857 | 1234 | 1065 | 1643 | 1029 | 1760 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| 27858 | 1234 | 1065 | 1643 | 877 | 1608 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 27859 | 12340 | 1065 | 1643 | 365 | 365 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12345 | 1065 | 1643 | 0 | 147 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12346 | 1065 | 1643 | 0 | 424 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12347 | 1065 | 1643 | 147 | 364 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12348 | 1065 | 1643 | 364 | 364 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27859 | 12349 | 1065 | 1643 | 395 | 395 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 27897 | 179 | 1065 | 1643 | 366 | 366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+--------+-----------+------------+------------+--------------------------+--------------------------+-------------------+-------------------+---------------------+---------------------+---------------------+---------------------+---------------------+---------------------+
以下是我最终得到的结果。
<无用的解释>
我一直面临的问题是:
- 如何处理任何和所有日期范围的重叠,并仅确定合同日期范围内的天数
- 客户端仍然使用SQL 2008,所以我需要一些老式的tsql
- 确保准确计算中断时间(合同之间的时间(
所以我选择了自己的解决方案,这可能是愚蠢的,因为它需要在内存中为每个Workday/Candidate组合生成一个记录。我看不出合同表超出了5-10k的记录范围。我朝这个方向走的唯一原因。
我创建了一个日历表,其中包含1980年1月1日至2050年12月31日的每个日期然后我根据CandidateId的日历表加入了合同范围。这些将是工作日期。日历表中与合同范围内的日期不匹配的任何日期都是休息日。
<无用的解释>
日历表
if object_id('CalendarTable') is not null drop table CalendarTable
go
create table CalendarTable (pk int identity, CalendarDate date )
declare @StartDate date = cast('1980-01-01' as date)
declare @EndDate date = cast('2050-12-31' as date)
while @StartDate <= @EndDate
begin
insert into CalendarTable ( CalendarDate ) values ( @StartDate )
set @StartDate = dateadd(dd, 1, @StartDate)
end
go
查询5年违规(工作5年,没有6个月的冷静期(
declare @enddate date = dateadd(dd, 30, getdate())
declare @beginDate date = dateadd(dd, -180, dateadd(year, -5, getdate()))
select poss.CandidateId,
min(work.CalendarDate) as FirstWorkDate,
count(work.CandidateId) as workedDays,
sum(case when work.CandidateId is null then 1 else 0 end) as breakDays,
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then 1 else 0 end as Year5Violation,
case when count(work.CandidateId) > (365*5) and sum(case when work.CandidateId is null then 1 else 0 end) < (365/2) then DATEADD(year, 5, min(work.CalendarDate)) else null end as ViolationDate
from
(
select cand.CandidateId, cal.CalendarDate
from CalendarTable cal
join (select distinct c.CandidateId from contracts c where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0) cand on 1 = 1
where cal.CalendarDate between @beginDate and @enddate
) as poss
left join
(
select distinct c.CandidateId, cal.CalendarDate
from contracts c
join CalendarTable cal on cal.CalendarDate between c.WorkPeriodStart and c.WorkPeriodEnd
where c.WorkPeriodStart is not null and c.WorkPeriodEnd is not null and c.Deleted = 0
) as work on work.CandidateId = poss.CandidateId and work.CalendarDate = poss.CalendarDate
group by poss.CandidateId