使用大小写和聚合语句将值拆分为四分位数



我有一个如下所示的员工数据集。开始日期是雇员的雇用日期,结束日期是雇员离开雇主的日期。两者都是日期数据类型。

Gender   StartDate       EndDate
M       2010-04-30      2013-06-18
F       2010-01-09      2015-06-19
M       2009-09-08      2014-08-13

我想根据平均就业月数将员工数据分为四分位数。结果还包括员工总数(员工列(、男性员工百分比、女性员工百分比和平均就业月数。请在下面找到预期结果:

Quartile    Employee    %Male   %Female   AvgMonths
1             20        60.00    40.00     8.75
2             25        50.00    50.00     28.5
3             10        40.00    60.00     41.25

我想根据工作的月数获得 25%、50% 和 75% 的四分位数,并取平均值以获得平均月数。

下面请找到我的查询,我不知道在哪里可以将四分位数计算添加到查询中。

declare @current date;
set @current='2012-12-31';
select count(*) as Employees,
cast(cast(AVG(CASE WHEN Gender = 'M' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) +'%' as Male,
cast(cast(AVG(CASE WHEN Gender = 'F' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) +'%' as Female,
AVG(CASE WHEN EndDate is null then DATEDIFF(MONTH, StartDate, @current)
when EndDate is not null then DATEDIFF(MONTH, StartDate, EndDate)
end) as AvgMonths
from dbo.DimEmployee

-----------------------更新-------------------------------------

我自己想通了。 请在下面找到代码:

declare @current date;
set @current='2012-12-31';
select count(*) as Employees,
cast(cast(AVG(CASE WHEN Gender = 'M' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(10)) +'%' as Male,
cast(cast(AVG(CASE WHEN Gender = 'F' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(10)) +'%' as Female,
AVG(t.EmployedMonths) as AvgMonths,
Ntile(3) over (order by t.EmployedMonths asc) as Quartiles
from
(select EmployeeKey, Gender,
CASE WHEN EndDate is null then abs(DATEDIFF(MONTH, StartDate, @current))
when EndDate is not null then abs(DATEDIFF(MONTH, StartDate, EndDate))
end as EmployedMonths
from dbo.DimEmployee)t
group by t.EmployedMonths

您可以使用rank()count(*)作为窗口函数来计算四分位数:

select floor( rnk * 4.0 / cnt ) as quartile,
count(*) as Employees,
cast(cast(AVG(CASE WHEN Gender = 'M' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) + '%' as Male,
cast(cast(AVG(CASE WHEN Gender = 'F' THEN 1.0 ELSE 0 END)*100 as decimal(18,2)) as nvarchar(5)) + '%' as Female,
AVG(num_months * 1.0) as AvgMonths
from (select e.*,
rank() over (partition by datediff(month, startdate, coalesce(enddate, @current))) - 1 as rnk,
count(*) over () as cnt,
datediff(month, startdate, coalesce(enddate, @current)) as num_months
from dbo.DimEmployee e
) e
group by floor( rnk * 4.0 / cnt )

最新更新