我在oracle中看到了许多滚动平均数的例子,但确实做到了我想要的。
这是我的原始数据
DATE SCORE AREA
----------------------------
01-JUL-14 60 A
01-AUG-14 45 A
01-SEP-14 45 A
02-SEP-14 50 A
01-OCT-14 30 A
02-OCT-14 45 A
03-OCT-14 50 A
01-JUL-14 60 B
01-AUG-14 45 B
01-SEP-14 45 B
02-SEP-14 50 B
01-OCT-14 30 B
02-OCT-14 45 B
03-OCT-14 50 B
这是我的滚动平均的期望结果
MMYY AVG AREA
-------------------------
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44 B
我需要它的工作方式是,对于每个MMYY,我需要回顾3个月,并平均每个部门的分数。例如,
对于OCT中的A区,从10月起的最后3个月内,共有6项研究,(45+45+50+30+45+50)/6=44.1
通常我会这样写查询
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA ORDER BY TO_CHAR(T.DT,'MMYY') ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
AS AVG
FROM T
这将查看最近3个企业,而不是最近3个月
实现这一点的一种方法是将聚合函数与分析函数混合使用。平均的关键思想是避免使用avg()
,而是使用sum()
除以count(*)
。
SELECT AREA, TO_CHAR(T.DT, 'MMYY') AS MMYY,
SUM(SCORE) / COUNT(*) as AvgScore,
SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) / SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM t
GROUP BY AREA, TO_CHAR(T.DT, 'MMYY') ;
注意order by
子句。如果您的数据跨越数年,那么使用MMYY格式会带来问题。月份最好使用YYYY-MM这样的格式,因为字母顺序与自然顺序相同。
您还可以指定范围,而不仅仅是行。
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA
ORDER BY DT RANGE BETWEEN INTERVAL '3' MONTH PRECEDING AND CURRENT ROW))
AS AVG
FROM T
由于CURRENT ROW
是默认值,因此只有ORDER BY DT RANGE INTERVAL '3' MONTH PRECEDING
也可以工作。也许你必须做一些微调,我没有测试关于每月28/29/30/31天问题的行为。
有关更多详细信息,请查看Oracle Windowing子句。
SQL> WITH DATA AS(
2 SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'A' area FROM dual UNION ALL
3 SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
4 SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
5 SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
6 SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'A' area FROM dual UNION ALL
7 SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
8 SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
9 SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'B' area FROM dual UNION ALL
10 SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
11 SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
12 SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual UNION ALL
13 SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'B' area FROM dual UNION ALL
14 SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
15 SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual)
16 SELECT TO_CHAR(T.DT, 'MON-RR') AS MMYY,
17 round(
18 SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)/
19 SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
20 AS avg_score,
21 AREA
22 FROM data t
23 GROUP BY AREA, TO_CHAR(T.DT, 'MON-RR')
24 /
MMYY AVG_SCORE A
------ ---------- -
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44.2 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44.2 B
8 rows selected.
SQL>
从下一次开始,我希望您提供create
和insert
语句,这样我们就不必花时间准备test case
了。
为什么是YY
格式?你没看到Y2K
的错误吗?请使用YYYY
格式。