数据插补方法最后观察结转(LOCF(是否在PostgreSQL中实现?
如果没有,我该如何实现此方法?
以下代码假定一个表tbl
,其中包含列a
、b
(键(、t
(时间(和v
(位置插补的值(:
create or replace function locf_s(a float, b float)
returns float
language sql
as '
select coalesce(b, a)
';
drop aggregate if exists locf(float);
CREATE AGGREGATE locf(FLOAT) (
SFUNC = locf_s,
STYPE = FLOAT
);
select a,b,t,v,
locf(v) over (PARTITION by a,b ORDER by t) as v_locf
from tbl
order by a,b,t
;
(SQLFiddle(
有关教程:"LOCF 和 Linear Imputation with PostgreSQL">
我直接基于链接文章中的表和数据
。create table test (
unit integer not null
check (unit >= 1),
obs_time integer not null
check (obs_time >= 1),
obs_value numeric(5, 1),
primary key (unit, obs_time)
);
insert into test values
(1, 1, 3.8), (1, 2, 3.1), (1, 3, 2.0),
(2, 1, 4.1), (2, 2, 3.5), (2, 3, 3.8), (2, 4, 2.4), (2, 5, 2.8), (2, 6, 3.0),
(3, 1, 2.7), (3, 2, 2.4), (3, 3, 2.9), (3, 4, 3.5);
对于链接文章中的六个观察结果,我们需要"单位"和"obs_time"的所有可能组合。
select distinct unit, times.obs_time
from test
cross join (select generate_series(1, 6) obs_time) times;
单元obs_time--1 11 21 31 41 51 62 1. . .3 6
我们还需要知道哪一行具有每个单元的最后一个观测值。
select unit, max(obs_time) obs_time
from test
group by unit
order by unit;
单元obs_time--1 32 63 4
知道了这两组,我们可以连接并合并以获得最后的观察结果并将其向前推进。
with unit_times as (
select distinct unit, times.obs_time
from test
cross join (select generate_series(1, 6) obs_time) times
), last_obs_time as (
select unit, max(obs_time) obs_time
from test
group by unit
)
select t1.unit, t1.obs_time,
coalesce(t2.obs_value, (select obs_value
from test
inner join last_obs_time
on test.unit = last_obs_time.unit
and test.obs_time = last_obs_time.obs_time
where test.unit = t1.unit)) obs_value
from unit_times t1
left join test t2
on t1.unit = t2.unit and t1.obs_time = t2.obs_time
order by t1.unit, t1.obs_time;
单元obs_time obs_value--1 1 3.81 2 3.11 3 2.01 4 2.01 5 2.01 6 2.02 1 4.1. . .3 4 3.53 5 3.53 6 3.5
要获得与链接文章显示的相同的视觉输出,请使用 tablefunc 模块中的 crosstab(( 函数。您也可以使用应用程序代码进行该操作。