Rolling regression residuals



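I hope you can help me with this problem. I want to run a rolling regression on a DataFrame in Python and compute the standard deviation of only a subset of the residuals.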

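For example: in the table below, I want to estimate the parameters over a moving window (e.g., regressing Y = [5, 7, 9, 10] on X_1 = [1, 2, 4, 5] and X_2 = [2, 3, 4, 4]), which gives intercept = 2.4, B_1 = 0.7, B_2 = 1. These estimates lead to residuals = [4.8, 0.5, -0.2, -0.2], and the standard deviation is measured on the last 3 residuals only, [0.5, -0.2, -0.2]. That value should be written to the ["standard deviation"] column.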

index   Y    X_1   X_2   standard deviation
0       5    1     2     0.404145188
1       7    2     3     2.081665999
2       9    4     4     2.511132239
3       10   5     4     0.864408264
4       11   6     2     NaN
5       14   5     5     NaN
6       17   7     6     NaN
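The calculation described above can be sketched with plain NumPy least squares. This is only a minimal sketch under stated assumptions: the window is taken to be forward-looking (row i is fitted on rows i to i+3, which matches the NaN values in the last three rows of the table), the standard deviation is the sample one (ddof=1), and the function name rolling_resid_std and its parameters are hypothetical. The figures in the question look illustrative, so the values produced here are not guaranteed to reproduce them.

```python
import numpy as np
import pandas as pd


def rolling_resid_std(df, y_col, x_cols, window=4, last_n=3):
    """Fit OLS (with intercept) on each forward-looking window of `window`
    rows and return the sample std (ddof=1) of the last `last_n` residuals.

    Hypothetical helper: the name and signature are illustrative only.
    """
    y = df[y_col].to_numpy(dtype=float)
    # Design matrix with a leading column of ones for the intercept
    X = np.column_stack([np.ones(len(df)), df[x_cols].to_numpy(dtype=float)])
    out = np.full(len(df), np.nan)  # rows without a full window stay NaN
    for start in range(len(df) - window + 1):
        sl = slice(start, start + window)
        beta, *_ = np.linalg.lstsq(X[sl], y[sl], rcond=None)
        resid = y[sl] - X[sl] @ beta
        out[start] = resid[-last_n:].std(ddof=1)
    return pd.Series(out, index=df.index)


df = pd.DataFrame({
    "Y": [5, 7, 9, 10, 11, 14, 17],
    "X_1": [1, 2, 4, 5, 6, 5, 7],
    "X_2": [2, 3, 4, 4, 2, 5, 6],
})
df["standard deviation"] = rolling_resid_std(df, "Y", ["X_1", "X_2"])
print(df)
```

An explicit loop is used because each window needs its own residual vector, not just its own coefficients. For long series a vectorised approach would likely be faster, but the loop keeps the logic easy to check.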

Below is a working example using RollingOLS from statsmodels. It is inspired by the answer to the question on grouped rolling OLS regression and prediction.

from statsmodels.regression.rolling import RollingOLS
from statsmodels.tools.tools import add_constant
import statsmodels.api as sm
import pandas as pd
import numpy as np

data = sm.datasets.grunfeld.load()
df_grunfeld = pd.DataFrame(data.data)
df_grunfeld.set_index(['firm'], append=True, inplace=True)


# Simple Model
# $$invest = beta_0 + beta_1 value$$
def invest_params(df_gf, intercept=False):
    """
    Function to operate on the data of a single firm.
    Assumes df_gf has the columns 'invest' and 'value' available.
    Returns a dataframe containing model parameters
    """
    # we should have at least k + 1 observations
    min_obs = 3 if intercept else 2
    wndw = 8
    # if there are fewer than min_obs rows in df_gf, RollingOLS will throw an error
    # Instead, handle this case separately
    if df_gf.shape[0] < min_obs:
        cols = ['coef_intercept', 'coef_value'] if intercept else ['coef_value']
        return pd.DataFrame(index=df_gf.index, columns=cols)
    y = df_gf['invest']
    x = add_constant(df_gf['value']) if intercept else df_gf['value']
    model = RollingOLS(y, x, expanding=True, min_nobs=min_obs, window=wndw).fit()
    # copy so that the extra columns are not added to model.params itself
    parameters = model.params.copy()
    params_shifted = model.params.shift(1)
    mse = model.mse_resid
    # fitted values from the current and from the previous row's coefficients
    regressors = add_constant(df_gf['value'])
    invest_hat = parameters.mul(regressors, axis=0).sum(axis=1, min_count=1)
    invest_hat_shift = params_shifted.mul(regressors, axis=0).sum(axis=1, min_count=1)
    parameters['invest_hat'] = invest_hat
    parameters['invest_hat_shift'] = invest_hat_shift
    parameters['mse'] = mse
    parameters['rmse'] = np.sqrt(mse)
    parameters['nobs'] = model.nobs
    parameters['ssr'] = model.ssr
    # the 'const' t-value only exists when an intercept is fitted
    if intercept:
        parameters['t_const'] = model.tvalues['const']
    parameters['t_value'] = model.tvalues['value']
    parameters.rename(columns={'const': 'b0', 'value': 'b1'}, inplace=True)
    parameters['r2_adj'] = model.rsquared_adj

    return parameters


# group_keys=False keeps the original index so that the join below aligns
grouped = df_grunfeld.groupby('firm', group_keys=False)
df_params = grouped.apply(lambda x: invest_params(x, True))
df_grunfeld_output = df_grunfeld.join(df_params, rsuffix='_coef')
