Suppose there are the following records:
Employee_id, work_start_date, work_end_date
1, 01-jan-2014, 07-jan-2014
1, 03-jan-2014, 12-jan-2014
1, 23-jan-2014, 25-jan-2014
2, 15-jan-2014, 25-jan-2014
2, 07-jan-2014, 15-jan-2014
2, 09-jan-2014, 12-jan-2014
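The requirement is to write a SQL select statement that sums the worked days grouped by employee_id, but excludes overlapping periods (that is, counts them only once).
The desired output would be: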
Employee_id, worked_days
1, 13
2, 18
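Worked days within a date range are calculated as follows: if work_start_date = 5 and work_end_date = 9, then worked_days = 4 (9 - 5).
I could write a PL/SQL function to solve this (iterating over the records manually and doing the calculation), but I believe it can be done in plain SQL for better performance.
Can someone point me in the right direction?
Thanks!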
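As a sanity check on the expected output, here is a minimal Python sketch (not part of the original question) of the manual iteration described above: sort each employee's ranges by start date, merge the ones that overlap or touch, and sum end - start over the merged ranges. The names ROWS and worked_days are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (employee_id, work_start_date, work_end_date)
ROWS = [
    (1, date(2014, 1, 1), date(2014, 1, 7)),
    (1, date(2014, 1, 3), date(2014, 1, 12)),
    (1, date(2014, 1, 23), date(2014, 1, 25)),
    (2, date(2014, 1, 15), date(2014, 1, 25)),
    (2, date(2014, 1, 7), date(2014, 1, 15)),
    (2, date(2014, 1, 9), date(2014, 1, 12)),
]


def worked_days(rows):
    """Sum (end - start) per employee after merging overlapping ranges."""
    by_emp = defaultdict(list)
    for emp, start, end in rows:
        by_emp[emp].append((start, end))

    totals = {}
    for emp, ranges in by_emp.items():
        ranges.sort()  # by start date, then end date
        total = 0
        cur_start, cur_end = ranges[0]
        for start, end in ranges[1:]:
            if start <= cur_end:
                # Overlaps (or touches) the current merged range: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current range and start a new one.
                total += (cur_end - cur_start).days
                cur_start, cur_end = start, end
        total += (cur_end - cur_start).days
        totals[emp] = total
    return totals


print(worked_days(ROWS))  # {1: 13, 2: 18}
```

For employee 1 the ranges merge into 01-jan to 12-jan (11 days) and 23-jan to 25-jan (2 days), which gives the 13 shown in the desired output.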
Here is a slightly modified query from a similar question:
Calculate sum of values associated with overlapping date ranges
SELECT "Employee_id",
       SUM( "work_end_date" - "work_start_date" )
FROM (
  -- Split each employee's timeline into elementary segments between
  -- consecutive distinct boundary dates (every start and every end).
  SELECT "Employee_id",
         "work_start_date",
         LEAD( "work_start_date" )
             OVER (PARTITION BY "Employee_id"
                   ORDER BY "work_start_date" )
             AS "work_end_date"
  FROM (
     SELECT "Employee_id", "work_start_date"
     FROM Table1
     UNION
     SELECT "Employee_id", "work_end_date"
     FROM Table1
  ) d
) x
-- Keep only the segments covered by at least one record of the same employee.
-- The last segment per employee has a NULL end date and drops out here.
WHERE EXISTS (
   SELECT 1 FROM Table1 t
   WHERE t."Employee_id" = x."Employee_id"
     AND t."work_start_date" <= x."work_start_date"
     AND t."work_end_date" >= x."work_end_date"
)
GROUP BY "Employee_id"
;
Demo: http://sqlfiddle.com/#!4/4fcce/2
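This is a tricky problem. For example, you cannot use lag(), because the overlapping period might not be the "previous" period. Or different periods can start and/or stop on the same day.
The idea is to reconstruct the periods. How? Find the records where a period starts, that is, records that no earlier-starting record of the same employee overlaps. Then use that as a flag and take a cumulative sum of the flag to number the groups of overlapping records. Getting the worked days is then just aggregation from there: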
with ps as (
      select e.*,
             -- 1 when no earlier-starting record of the same employee
             -- reaches this record's start date, i.e. a new period begins here
             (case when exists (select 1
                                from emps e2
                                where e2.employee_id = e.employee_id and
                                      e2.work_start_date < e.work_start_date and
                                      e2.work_end_date >= e.work_start_date
                               )
                   then 0 else 1
              end) as IsPeriodStart
      from emps e
     )
select employee_id, sum(work_end_date - work_start_date) as Days_Worked
from (select employee_id, min(work_start_date) as work_start_date,
             max(work_end_date) as work_end_date
      from (select ps.*,
                   -- the running sum of the flag numbers the merged periods
                   sum(IsPeriodStart) over (partition by employee_id
                                            order by work_start_date
                                           ) as grp
            from ps
           ) ps
      group by employee_id, grp
     ) ps
group by employee_id;
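To make the flag-and-running-sum idea easier to follow, here is a rough Python sketch of the same three steps (flag the period starts, accumulate the flag into a group number, aggregate min start and max end per group). It is an illustration only, not a translation of the query: dates are reduced to the day of January 2014, the names ROWS and days_worked are invented, and ties on the start date are resolved by sort position rather than by the window frame.

```python
from itertools import groupby

# (employee_id, start_day, end_day): the question's sample rows, with each
# date reduced to its day of January 2014 for brevity.
ROWS = [(1, 1, 7), (1, 3, 12), (1, 23, 25), (2, 15, 25), (2, 7, 15), (2, 9, 12)]


def days_worked(rows):
    totals = {}
    # Sort by employee, then start, then end, and process one employee at a time.
    for emp, emp_rows in groupby(sorted(rows), key=lambda r: r[0]):
        emp_rows = list(emp_rows)
        groups = {}  # group number -> (min start, max end)
        grp = 0
        for i, (_, start, end) in enumerate(emp_rows):
            # Step 1: the row starts a new period unless some earlier row
            # (in sort order) ends on or after this row's start.
            covered = any(e2 >= start for _, _, e2 in emp_rows[:i])
            # Step 2: running sum of the period-start flag = group number.
            grp += 0 if covered else 1
            # Step 3: track min(start) and max(end) per group.
            lo, hi = groups.get(grp, (start, end))
            groups[grp] = (min(lo, start), max(hi, end))
        totals[emp] = sum(hi - lo for lo, hi in groups.values())
    return totals


print(days_worked(ROWS))  # {1: 13, 2: 18}
```

The inner any(...) scan mirrors the correlated EXISTS subquery, so like the SQL it is quadratic per employee; that is fine for a sketch but worth keeping in mind for large tables.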
The date_tbl type:
create or replace package RG_TYPE is
  type date_tbl is table of date;
end;
Function (returns a table with every date between the 2 parameters):
create or replace function dates (p_from date, p_to date)
  return rg_type.date_tbl pipelined
is
  l_idx date := p_from;
begin
  loop
    -- a null p_to yields just the single day p_from
    if l_idx > nvl(p_to, p_from) then
      exit;
    end if;
    pipe row(l_idx);
    l_idx := l_idx + 1;
  end loop;
  return;
end;
SQL:
select employee_id, sum(c)
from (select e.employee_id, d.column_value,
             count(distinct w.employee_id) as c  -- 1 per day worked
      from (select distinct employee_id from works) e,
           table(dates((select min(work_start_date) as a from works),
                       (select max(work_end_date) as b from works))) d,
           works w
      where e.employee_id = w.employee_id
        and d.column_value >= w.work_start_date
        and d.column_value < w.work_end_date  -- end day is excluded
      group by e.employee_id, d.column_value) Sub
group by employee_id
order by 1, 2
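The calendar approach above boils down to counting, per employee, the distinct days covered by at least one range, with the end day excluded. A small Python sketch of that counting, under the same simplifying assumption that dates are whole days (here again reduced to the day of January 2014; ROWS and worked_days_by_enumeration are illustrative names):

```python
# (employee_id, start_day, end_day): the question's sample rows, with each
# date reduced to its day of January 2014 for brevity.
ROWS = [(1, 1, 7), (1, 3, 12), (1, 23, 25), (2, 15, 25), (2, 7, 15), (2, 9, 12)]


def worked_days_by_enumeration(rows):
    days = {}  # employee_id -> set of distinct days worked
    for emp, start, end in rows:
        # Half-open range [start, end): the end day is excluded, which
        # matches worked_days = end - start for a single range.
        days.setdefault(emp, set()).update(range(start, end))
    return {emp: len(worked) for emp, worked in sorted(days.items())}


print(worked_days_by_enumeration(ROWS))  # {1: 13, 2: 18}
```

Because every day is materialized, the cost grows with the length of the date ranges rather than with the number of records, so this suits short periods better than multi-year ones.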