I have this table:
create table t (value int, dt date);
 value |     dt
-------+------------
    10 | 2012-10-30
    15 | 2012-10-29
  null | 2012-10-28
  null | 2012-10-27
     7 | 2012-10-26
I want this output:
 value |     dt
-------+------------
    10 | 2012-10-30
     5 | 2012-10-29
     5 | 2012-10-28
     5 | 2012-10-27
     7 | 2012-10-26
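When the rows are ordered by date descending, I want each run of null values, together with the non-null value that precedes it, to be replaced by that non-null value divided evenly across those rows. In this example, 15 is the previous non-null value for the two nulls that follow it, so each of the three rows becomes 15 / 3 = 5.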
SQL Fiddle
I found a surprisingly simple solution:
SELECT max(value) OVER (PARTITION BY grp)
     / count(*)   OVER (PARTITION BY grp) AS value
     , dt
FROM  (
   SELECT *, count(value) OVER (ORDER BY dt DESC) AS grp
   FROM   t
   ) a;
-> sqlfiddle
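Since count() ignores NULL values, you can use a running count (the default frame for a window function with ORDER BY) to quickly form groups of rows ( -> grp).
Every group has exactly one non-null value, so min, max, or sum in another window function all return the same result. Divide by the number of members in grp (count(*) this time, so that NULL values are counted too!), and we are done.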
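To make the two passes concrete, here is a minimal sketch in plain Python that mimics what the window functions compute on the sample data. The function name `spread_over_nulls` is made up for illustration, `None` stands in for NULL, and the sketch assumes all dates are distinct. It also uses true division, whereas an int column in SQL may truncate.

```python
from datetime import date

# Sample rows from the question as (value, dt); None stands in for SQL NULL.
rows = [
    (10, date(2012, 10, 30)),
    (15, date(2012, 10, 29)),
    (None, date(2012, 10, 28)),
    (None, date(2012, 10, 27)),
    (7, date(2012, 10, 26)),
]


def spread_over_nulls(rows):
    """Return (grp, value, dt) tuples, mimicking the two window-function passes."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)  # ORDER BY dt DESC

    # Pass 1: running count of non-null values,
    # like count(value) OVER (ORDER BY dt DESC). Nulls do not advance the count.
    groups = []
    running = 0
    for value, _ in ordered:
        if value is not None:
            running += 1
        groups.append(running)

    # Pass 2: per group, remember the single non-null value and the member count,
    # like max(value) and count(*) OVER (PARTITION BY grp).
    group_value = {}
    group_size = {}
    for (value, _), grp in zip(ordered, groups):
        group_size[grp] = group_size.get(grp, 0) + 1
        if value is not None:
            group_value[grp] = value

    result = []
    for (_, dt), grp in zip(ordered, groups):
        base = group_value.get(grp)  # None if leading rows are null (no group value)
        result.append((grp, None if base is None else base / group_size[grp], dt))
    return result


for grp, value, dt in spread_over_nulls(rows):
    print(grp, value, dt)
```

The first column printed is the grp number: the two null rows share group 2 with the 15 that precedes them, which is why a per-group count of 3 is the right divisor.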
As a puzzle, this is a solution... in practice, it may perform horribly depending on the nature of your data. Watch your indexes in any case:
create database tmp;
use tmp;
create table t (value float, dt date); -- if you use int, you need to care about rounding
insert into t values
  (10, '2012-10-30'),
  (15, '2012-10-29'),
  (null, '2012-10-28'),
  (null, '2012-10-27'),
  (7, '2012-10-26');
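select t1.dt, t1.value, t2.dt, t2.value, count(*) cnt
from t t1, t t2, t t3
where
  -- t2: the earliest non-null row at or after t1
  t2.dt >= t1.dt and t2.value is not null
  and not exists (
    select *
    from t
    where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
  )
  -- t3: t2 itself plus the run of null rows directly before it
  and t3.dt <= t2.dt
  and not exists (
    select *
    from t
    where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
  )
-- list every selected column so the query also runs with ONLY_FULL_GROUP_BY enabled
group by t1.dt, t1.value, t2.dt, t2.value
order by t1.dt;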
+------------+-------+------------+-------+-----+
| dt         | value | dt         | value | cnt |
+------------+-------+------------+-------+-----+
| 2012-10-26 |     7 | 2012-10-26 |     7 |   1 |
| 2012-10-27 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-28 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-29 |    15 | 2012-10-29 |    15 |   3 |
| 2012-10-30 |    10 | 2012-10-30 |    10 |   1 |
+------------+-------+------------+-------+-----+
5 rows in set (0.00 sec)
select dt, value/cnt
from (
  select t1.dt, t2.value, count(*) cnt
  from t t1, t t2, t t3
  where
    t2.dt >= t1.dt and t2.value is not null
    and not exists (
      select *
      from t
      where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
    )
    and t3.dt <= t2.dt
    and not exists (
      select *
      from t
      where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
    )
  -- t2.value is added so the query also runs with ONLY_FULL_GROUP_BY enabled
  group by t1.dt, t2.value
) x
order by dt;
+------------+-----------+
| dt         | value/cnt |
+------------+-----------+
| 2012-10-26 |         7 |
| 2012-10-27 |         5 |
| 2012-10-28 |         5 |
| 2012-10-29 |         5 |
| 2012-10-30 |        10 |
+------------+-----------+
5 rows in set (0.00 sec)
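Explanation:
- t1 is the original table
- t2 is the row with the minimum date, at or after t1's date, that has a non-null value
- t3 is every row in between (t2 itself plus the null rows directly before it), so we can group by the others and count
Sorry, I can't make it any clearer. It confuses me too :-)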
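
The three roles can also be read as nested loops. The following is a rough Python sketch, not the query itself: `has_value_between` and `self_join_counts` are hypothetical helper names, `None` stands in for NULL, and the data is the sample from the question.

```python
from datetime import date

# Sample rows from the question as (value, dt); None stands in for SQL NULL.
rows = [
    (10, date(2012, 10, 30)),
    (15, date(2012, 10, 29)),
    (None, date(2012, 10, 28)),
    (None, date(2012, 10, 27)),
    (7, date(2012, 10, 26)),
]


def has_value_between(rows, start, end):
    """True if some non-null row has start <= dt < end (the NOT EXISTS subqueries)."""
    return any(v is not None and start <= d < end for v, d in rows)


def self_join_counts(rows):
    """Return (t1.dt, t2.value, cnt, t2.value / cnt) for each t1 row, ordered by date."""
    out = []
    for _, d1 in sorted(rows, key=lambda r: r[1]):  # t1: every row of the table
        for v2, d2 in rows:  # t2: first non-null row at or after t1
            if v2 is None or d2 < d1 or has_value_between(rows, d1, d2):
                continue
            # t3: rows at or before t2 with no non-null row in [t3.dt, t2.dt)
            cnt = sum(
                1
                for _, d3 in rows
                if d3 <= d2 and not has_value_between(rows, d3, d2)
            )
            out.append((d1, v2, cnt, v2 / cnt))
    return out


for row in self_join_counts(rows):
    print(row)
```

Each t1 row scans the table for t2 and again for t3, and every check rescans it once more, which hints at why the join can perform badly on larger tables.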