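The dataset is shown below: a sample dataset of employee login activity counts, named "activity".
I need to compute a few metrics. I can do this with a Python dataframe, but I am new to MySQL.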
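- What was the average number of active employees per day, by department, in January 2018? (I got about halfway, but the result is incorrect.)
- The number of unique active employees (logins > 0) for the month, per dept_id, in January 2018. (I was able to do this one.)
- The month-over-month growth from December 2017 to January 2018, across all dept_ids where at least one employee was active (logins > 0). I do not know how to do this in SQL.
- The proportion of users who were active in each dept_id in December 2017 and were also active in the same dept_id in January 2018.
- How many employees logged in on 3 or more consecutive days in January 2018?
Any help would be appreciated.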
For case 1, this is the query I wrote:
select dept_id,
       DAU
from (
    select dept_id,
           month(date),
           year(date),
           avg(logins) as DAU
    from (select * from activity where logins > 0) a  -- MySQL requires an alias on every derived table
    where year(date) = 2018
      and month(date) = 1
    group by dept_id, month(date), year(date)
) t;
Dataset in text format:
date dept_id emp_id logins
29-11-2017 ABC001 A1 1
30-11-2017 ABC002 A2 2
01-12-2017 XYZ001 A3 0
01-12-2017 XYZ002 A4 1
03-12-2017 ABC001 D2 4
04-12-2017 ABC002 D1 1
05-12-2017 XYZ001 A6 2
05-12-2017 XYZ002 A7 3
30-12-2017 ABC001 A8 0
01-01-2018 ABC002 A2 6
02-01-2018 XYZ001 A10 4
03-01-2018 XYZ002 A11 2
04-01-2018 ABC001 A1 2
04-01-2018 ABC002 A2 0
05-01-2018 XYZ001 A13 4
05-01-2018 XYZ001 A6 2
05-01-2018 XYZ002 A7 1
06-01-2018 XYZ001 A6 2
06-01-2018 XYZ002 A7 3
07-01-2018 XYZ001 A6 3
07-01-2018 XYZ002 A7 4
06-01-2018 XYZ002 A14 3
30-01-2018 ABC001 A15 2
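Please let me know whether this works; otherwise I will update the answer. I do not have MySQL installed, so I could not check it.
DATE is a keyword in Oracle; I am not sure about MySQL, so use it in quotes, like "date" (MySQL quotes identifiers with backticks rather than double quotes by default). The queries below assume a table named mytable with the date column named DATES, and they use Oracle functions such as to_char and nvl; in MySQL the closest equivalents are DATE_FORMAT and IFNULL.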
Case 1:
SELECT dept_id,
       AVG(cnt) average_emp
FROM (SELECT dept_id,
             days,
             COUNT(emp_id) cnt
      FROM (SELECT dept_id,
                   emp_id,
                   SUM(logins) logins,
                   to_char(DATES, 'dd') days
            FROM mytable
            WHERE to_char(DATES, 'mmyyyy') = '012018'
            GROUP BY dept_id,
                     emp_id,
                     to_char(DATES, 'dd'))
      WHERE logins > 0
      GROUP BY dept_id,
               days)
GROUP BY dept_id;
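For anyone who wants to check the numbers without an Oracle or MySQL server, here is a minimal sketch that runs the same idea against the sample data with Python's built-in sqlite3 module. It rests on assumptions that are not in the query above: the table is named activity, dates are stored as ISO text (YYYY-MM-DD), and the average is taken only over days on which a department had at least one active employee. Note that the asker's attempt averaged logins, which measures logins per row rather than employees per day.

```python
import sqlite3

# Rows copied from the question: date (dd-mm-yyyy), dept_id, emp_id, logins.
RAW = """
29-11-2017 ABC001 A1 1
30-11-2017 ABC002 A2 2
01-12-2017 XYZ001 A3 0
01-12-2017 XYZ002 A4 1
03-12-2017 ABC001 D2 4
04-12-2017 ABC002 D1 1
05-12-2017 XYZ001 A6 2
05-12-2017 XYZ002 A7 3
30-12-2017 ABC001 A8 0
01-01-2018 ABC002 A2 6
02-01-2018 XYZ001 A10 4
03-01-2018 XYZ002 A11 2
04-01-2018 ABC001 A1 2
04-01-2018 ABC002 A2 0
05-01-2018 XYZ001 A13 4
05-01-2018 XYZ001 A6 2
05-01-2018 XYZ002 A7 1
06-01-2018 XYZ001 A6 2
06-01-2018 XYZ002 A7 3
07-01-2018 XYZ001 A6 3
07-01-2018 XYZ002 A7 4
06-01-2018 XYZ002 A14 3
30-01-2018 ABC001 A15 2
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (date TEXT, dept_id TEXT, emp_id TEXT, logins INTEGER)"
)
for line in RAW.strip().splitlines():
    d, dept, emp, logins = line.split()
    day, month, year = d.split("-")
    # Store dates as ISO text so plain string comparison orders them correctly.
    conn.execute(
        "INSERT INTO activity VALUES (?, ?, ?, ?)",
        (f"{year}-{month}-{day}", dept, emp, int(logins)),
    )

# Step 1: distinct active employees per department per day.
# Step 2: average those daily counts per department.
rows = conn.execute("""
    SELECT dept_id, AVG(active_emps) AS avg_daily_active
    FROM (
        SELECT dept_id, date, COUNT(DISTINCT emp_id) AS active_emps
        FROM activity
        WHERE logins > 0
          AND date >= '2018-01-01' AND date < '2018-02-01'
        GROUP BY dept_id, date
    ) AS per_day
    GROUP BY dept_id
    ORDER BY dept_id
""").fetchall()
print(rows)
```

On the sample data this gives 1.0 for ABC001 and ABC002 and 1.25 for XYZ001 and XYZ002. If the average should instead be over all 31 days of the month, divide the summed daily counts by 31 rather than using AVG.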
Case 2:
SELECT dept_id,
       COUNT(emp_id)
FROM (SELECT dept_id,
             emp_id,
             SUM(logins) logins
      FROM mytable
      WHERE to_char(DATES, 'mmyyyy') = '012018'
      GROUP BY dept_id,
               emp_id)
WHERE logins > 0
GROUP BY dept_id;
Case 3:
SELECT months,
       users,
       ROUND((users - nvl(LAG(users) OVER (ORDER BY rownum), users))
             / nvl(LAG(users) OVER (ORDER BY rownum), 1), 2) growth_rate
FROM (SELECT to_char(mt.DATES, 'MON-YYYY') months,
             count(mt.EMP_ID) users
      FROM (SELECT *
            FROM MYTABLE
            ORDER BY DATES) mt
      WHERE mt.DATES >= to_date('DEC-2017', 'MON-YYYY')
        -- to_date with only month and year yields the first day of that month,
        -- so use an exclusive February bound to include all of January
        AND mt.DATES < to_date('FEB-2018', 'MON-YYYY')
      GROUP BY to_char(mt.DATES, 'MON-YYYY')
      ORDER BY to_date(months, 'MON-YYYY')) oq
WHERE exists(SELECT 1
             FROM MYTABLE iq
             WHERE to_char(iq.DATES, 'MON-YYYY') = oq.months
               AND iq.EMP_ID IN (SELECT EMP_ID
                                 FROM MYTABLE
                                 WHERE iq.LOGINS > 0));
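The growth calculation can also be written without LAG, using conditional aggregation. The sketch below is one possible reading, not necessarily what the asker meant: it treats "growth" as the change in the number of distinct active employees (logins > 0) across all departments, and it assumes a table named activity with ISO text dates, loaded into Python's built-in sqlite3 module so it can be run as is.

```python
import sqlite3

# Rows copied from the question: date (dd-mm-yyyy), dept_id, emp_id, logins.
RAW = """
29-11-2017 ABC001 A1 1
30-11-2017 ABC002 A2 2
01-12-2017 XYZ001 A3 0
01-12-2017 XYZ002 A4 1
03-12-2017 ABC001 D2 4
04-12-2017 ABC002 D1 1
05-12-2017 XYZ001 A6 2
05-12-2017 XYZ002 A7 3
30-12-2017 ABC001 A8 0
01-01-2018 ABC002 A2 6
02-01-2018 XYZ001 A10 4
03-01-2018 XYZ002 A11 2
04-01-2018 ABC001 A1 2
04-01-2018 ABC002 A2 0
05-01-2018 XYZ001 A13 4
05-01-2018 XYZ001 A6 2
05-01-2018 XYZ002 A7 1
06-01-2018 XYZ001 A6 2
06-01-2018 XYZ002 A7 3
07-01-2018 XYZ001 A6 3
07-01-2018 XYZ002 A7 4
06-01-2018 XYZ002 A14 3
30-01-2018 ABC001 A15 2
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (date TEXT, dept_id TEXT, emp_id TEXT, logins INTEGER)"
)
for line in RAW.strip().splitlines():
    d, dept, emp, logins = line.split()
    day, month, year = d.split("-")
    conn.execute(
        "INSERT INTO activity VALUES (?, ?, ?, ?)",
        (f"{year}-{month}-{day}", dept, emp, int(logins)),
    )

# Inner query: distinct active employees in each of the two months.
# Outer query: (January - December) / December; NULLIF avoids dividing by zero.
row = conn.execute("""
    SELECT dec_users,
           jan_users,
           ROUND((jan_users - dec_users) * 1.0 / NULLIF(dec_users, 0), 2) AS growth
    FROM (
        SELECT COUNT(DISTINCT CASE WHEN date <  '2018-01-01' THEN emp_id END) AS dec_users,
               COUNT(DISTINCT CASE WHEN date >= '2018-01-01' THEN emp_id END) AS jan_users
        FROM activity
        WHERE logins > 0
          AND date >= '2017-12-01' AND date < '2018-02-01'
    ) AS monthly
""").fetchone()
print(row)
```

On the sample data there are 5 active employees in December and 9 in January, so the growth rate comes out as 0.8. The SQL uses only COUNT(DISTINCT ...), CASE, NULLIF and ROUND, so it should carry over to MySQL largely unchanged once the column holds real DATE values.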
Case 4:
SELECT dept_id,
       emp_id
FROM (SELECT dept_id,
             emp_id
      FROM mytable
      WHERE to_char(DATES, 'mmyyyy') = '122017'
        AND logins > 0
      GROUP BY dept_id,
               emp_id)
INTERSECT
SELECT dept_id,
       emp_id
FROM (SELECT dept_id,
             emp_id
      FROM mytable
      WHERE to_char(DATES, 'mmyyyy') = '012018'
        AND logins > 0
      GROUP BY dept_id,
               emp_id);
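The INTERSECT query lists the (dept_id, emp_id) pairs that were active in both months; turning that into a proportion needs one more step. A minimal sketch of that step follows, assuming the ratio is "December-active employees who were also active in January, divided by December-active employees" per department, and assuming a table named activity with ISO text dates in Python's built-in sqlite3 module. It uses a LEFT JOIN instead of INTERSECT so that departments with no retained employees still appear with a ratio of 0.

```python
import sqlite3

# Rows copied from the question: date (dd-mm-yyyy), dept_id, emp_id, logins.
RAW = """
29-11-2017 ABC001 A1 1
30-11-2017 ABC002 A2 2
01-12-2017 XYZ001 A3 0
01-12-2017 XYZ002 A4 1
03-12-2017 ABC001 D2 4
04-12-2017 ABC002 D1 1
05-12-2017 XYZ001 A6 2
05-12-2017 XYZ002 A7 3
30-12-2017 ABC001 A8 0
01-01-2018 ABC002 A2 6
02-01-2018 XYZ001 A10 4
03-01-2018 XYZ002 A11 2
04-01-2018 ABC001 A1 2
04-01-2018 ABC002 A2 0
05-01-2018 XYZ001 A13 4
05-01-2018 XYZ001 A6 2
05-01-2018 XYZ002 A7 1
06-01-2018 XYZ001 A6 2
06-01-2018 XYZ002 A7 3
07-01-2018 XYZ001 A6 3
07-01-2018 XYZ002 A7 4
06-01-2018 XYZ002 A14 3
30-01-2018 ABC001 A15 2
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (date TEXT, dept_id TEXT, emp_id TEXT, logins INTEGER)"
)
for line in RAW.strip().splitlines():
    d, dept, emp, logins = line.split()
    day, month, year = d.split("-")
    conn.execute(
        "INSERT INTO activity VALUES (?, ?, ?, ?)",
        (f"{year}-{month}-{day}", dept, emp, int(logins)),
    )

# dec_active / jan_active: one row per (dept_id, emp_id) active in that month.
# COUNT(j.emp_id) counts only December employees that found a January match.
rows = conn.execute("""
    WITH dec_active AS (
        SELECT DISTINCT dept_id, emp_id
        FROM activity
        WHERE logins > 0 AND date >= '2017-12-01' AND date < '2018-01-01'
    ),
    jan_active AS (
        SELECT DISTINCT dept_id, emp_id
        FROM activity
        WHERE logins > 0 AND date >= '2018-01-01' AND date < '2018-02-01'
    )
    SELECT d.dept_id,
           COUNT(d.emp_id) AS dec_users,
           COUNT(j.emp_id) AS retained,
           ROUND(COUNT(j.emp_id) * 1.0 / COUNT(d.emp_id), 2) AS ratio
    FROM dec_active AS d
    LEFT JOIN jan_active AS j
           ON j.dept_id = d.dept_id AND j.emp_id = d.emp_id
    GROUP BY d.dept_id
    ORDER BY d.dept_id
""").fetchall()
print(rows)
```

On the sample data, XYZ001 retains its single December employee (ratio 1.0), XYZ002 retains one of two (0.5), and ABC001 and ABC002 retain none (0.0).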
Case 5:
-- not foolproof
SELECT COUNT(*) emp_cnt
FROM (SELECT emp_id,
             DENSE_RANK() OVER (ORDER BY DATES) rn,
             COUNT(*) OVER (PARTITION BY emp_id ORDER BY DATES) cnt
      FROM mytable
      WHERE to_char(DATES, 'mmyyyy') = '012018'
        AND logins > 0
      ORDER BY rn,
               cnt)
WHERE rn = cnt
  AND rn >= 3;
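Since the query above is flagged as not foolproof, here is an alternative sketch that checks for a streak directly: an employee qualifies if there is some active day d in January for which days d+1 and d+2 are also active. It swaps the window-function approach for two self-joins, and it assumes a table named activity with ISO text dates in Python's built-in sqlite3 module. The date(x, '+1 day') call is SQLite-specific; in MySQL the same step would be written with DATE_ADD(x, INTERVAL 1 DAY).

```python
import sqlite3

# Rows copied from the question: date (dd-mm-yyyy), dept_id, emp_id, logins.
RAW = """
29-11-2017 ABC001 A1 1
30-11-2017 ABC002 A2 2
01-12-2017 XYZ001 A3 0
01-12-2017 XYZ002 A4 1
03-12-2017 ABC001 D2 4
04-12-2017 ABC002 D1 1
05-12-2017 XYZ001 A6 2
05-12-2017 XYZ002 A7 3
30-12-2017 ABC001 A8 0
01-01-2018 ABC002 A2 6
02-01-2018 XYZ001 A10 4
03-01-2018 XYZ002 A11 2
04-01-2018 ABC001 A1 2
04-01-2018 ABC002 A2 0
05-01-2018 XYZ001 A13 4
05-01-2018 XYZ001 A6 2
05-01-2018 XYZ002 A7 1
06-01-2018 XYZ001 A6 2
06-01-2018 XYZ002 A7 3
07-01-2018 XYZ001 A6 3
07-01-2018 XYZ002 A7 4
06-01-2018 XYZ002 A14 3
30-01-2018 ABC001 A15 2
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (date TEXT, dept_id TEXT, emp_id TEXT, logins INTEGER)"
)
for line in RAW.strip().splitlines():
    d, dept, emp, logins = line.split()
    day, month, year = d.split("-")
    conn.execute(
        "INSERT INTO activity VALUES (?, ?, ?, ?)",
        (f"{year}-{month}-{day}", dept, emp, int(logins)),
    )

# active_jan: one row per employee per active day in January 2018.
# d1 is the first day of a candidate streak; d2 and d3 must be the next two days.
rows = conn.execute("""
    WITH active_jan AS (
        SELECT DISTINCT emp_id, date
        FROM activity
        WHERE logins > 0
          AND date >= '2018-01-01' AND date < '2018-02-01'
    )
    SELECT DISTINCT d1.emp_id
    FROM active_jan AS d1
    JOIN active_jan AS d2
      ON d2.emp_id = d1.emp_id AND d2.date = date(d1.date, '+1 day')
    JOIN active_jan AS d3
      ON d3.emp_id = d1.emp_id AND d3.date = date(d1.date, '+2 days')
    ORDER BY d1.emp_id
""").fetchall()
streak_emps = [r[0] for r in rows]
print(streak_emps, len(streak_emps))
```

On the sample data this finds two employees, A6 and A7, both active on 5, 6 and 7 January. The sketch ignores dept_id, so a streak that spans departments still counts; add dept_id to the join conditions if streaks should be per department.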