I copied an example from the linearmodels PanelOLS introduction and added robust standard errors to learn how to use the module. This is the code I used:
import statsmodels.api as sm2
from linearmodels import PanelOLS
from linearmodels.datasets import jobtraining

# Index by firm (entity) and year (time), as PanelOLS expects
data = jobtraining.load()
mi_data = data.set_index(['fcode', 'year'])
mi_data.head()

# Entity fixed effects with White (heteroskedasticity-robust) standard errors
mod = PanelOLS(mi_data.lscrap, sm2.add_constant(mi_data.hrsemp), entity_effects=True)
print(mod.fit(cov_type='robust'))
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.0528
Estimator:                   PanelOLS   R-squared (Between):             -0.0029
No. Observations:                 140   R-squared (Within):               0.0528
Date:                Tue, May 05 2020   R-squared (Overall):              0.0048
Time:                        10:49:58   Log-likelihood                   -90.459
Cov. Estimator:                Robust
                                        F-statistic:                      5.0751
Entities:                          48   P-value                           0.0267
Avg Obs:                       2.9167   Distribution:                    F(1,91)
Min Obs:                       1.0000
Max Obs:                       3.0000   F-statistic (robust):             8.2299
                                        P-value                           0.0051
Time periods:                       3   Distribution:                    F(1,91)
Avg Obs:                       46.667
Min Obs:                       46.000
Max Obs:                       48.000

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.4982     0.0555     8.9714     0.0000      0.3879      0.6085
hrsemp        -0.0054     0.0019    -2.8688     0.0051     -0.0092     -0.0017
==============================================================================

F-test for Poolability: 17.094
P-value: 0.0000
Distribution: F(47,91)

Included effects: Entity
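When I compare these results with the fixed-effects regression with robust standard errors that I would normally run in Stata, the standard errors are quite different.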
xtset fcode year
xtreg lscrap hrsemp , fe vce(robust)
Fixed-effects (within) regression               Number of obs      =       140
Group variable: fcode                           Number of groups   =        48
R-sq:  within  = 0.0528                         Obs per group: min =         1
       between = 0.0002                                        avg =       2.9
       overall = 0.0055                                        max =         3
                                                F(1,47)            =      7.93
corr(u_i, Xb)  = -0.0266                        Prob > F           =    0.0071
                                  (Std. Err. adjusted for 48 clusters in fcode)
------------------------------------------------------------------------------
             |               Robust
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hrsemp |  -.0054186   .0019243    -2.82   0.007    -.0092897   -.0015474
       _cons |   .4981764   .0295415    16.86   0.000     .4387464    .5576063
-------------+----------------------------------------------------------------
     sigma_u |  1.4004191
     sigma_e |  .57268937
         rho |  .85672692   (fraction of variance due to u_i)
------------------------------------------------------------------------------
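I don't understand where the difference comes from, because without robust SEs the results are (almost) identical. How can I get the same robust SEs from linearmodels' PanelOLS in Python as I get in Stata?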
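White's robust covariance, which linearmodels uses when you pass cov_type='robust', is robust to heteroskedasticity but not to correlation within an entity, so it is not the appropriate robust estimator for a fixed-effects model. Stata's xtreg, fe vce(robust) actually clusters by the panel variable, as the note "(Std. Err. adjusted for 48 clusters in fcode)" in your output shows. To get the same kind of standard errors, use cov_type='clustered', cluster_entity=True instead. See the corresponding entry in the linearmodels manual.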
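To see why the two estimators can disagree, here is a minimal NumPy sketch of the two sandwich formulas for a one-regressor within (fixed-effects) estimator. Everything in it is an assumption made for illustration: the synthetic random-walk panel, the variable names, and the omission of any small-sample correction (so it will not reproduce linearmodels or Stata output exactly). The only difference between the two versions is whether the scores x_it * u_it are squared observation by observation (White) or summed within each entity first (clustered).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic panel: 50 entities observed for 5 periods each.
n_entities, n_periods = 50, 5

def random_walk():
    # A random walk within each entity, so values stay correlated over time
    # even after the entity mean has been removed.
    return rng.normal(size=(n_entities, n_periods)).cumsum(axis=1)

x = random_walk()
u = random_walk()
alpha = rng.normal(size=(n_entities, 1))  # entity fixed effects
y = alpha + 0.5 * x + u

# Within transformation: subtract each entity's mean (this removes alpha).
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)

# Fixed-effects slope estimate and residuals.
beta = (xd * yd).sum() / (xd ** 2).sum()
resid = yd - beta * xd

bread = 1.0 / (xd ** 2).sum()
scores = xd * resid

# White: treat every observation's score as independent.
meat_white = (scores ** 2).sum()
# Clustered by entity: sum scores within an entity before squaring,
# which keeps the within-entity cross products.
meat_cluster = (scores.sum(axis=1) ** 2).sum()

se_white = np.sqrt(bread * meat_white * bread)
se_cluster = np.sqrt(bread * meat_cluster * bread)

print(f"beta = {beta:.4f}")
print(f"White SE     = {se_white:.4f}")
print(f"Clustered SE = {se_cluster:.4f}")
```

When errors are uncorrelated within an entity the cross products average out and the two standard errors are close; with persistent errors, as in this sketch, they can differ noticeably.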
Full code:
import statsmodels.api as sm2
from linearmodels import PanelOLS
from linearmodels.datasets import jobtraining

# Index by firm (entity) and year (time), as PanelOLS expects
data = jobtraining.load()
mi_data = data.set_index(['fcode', 'year'])
mi_data.head()

mod = PanelOLS(mi_data.lscrap, sm2.add_constant(mi_data.hrsemp), entity_effects=True)
# Cluster by entity to mirror Stata's vce(robust) for xtreg, fe
print(mod.fit(cov_type='clustered', cluster_entity=True))
The corresponding output is nearly identical to Stata's:
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.0528
Estimator:                   PanelOLS   R-squared (Between):             -0.0029
No. Observations:                 140   R-squared (Within):               0.0528
Date:                Tue, May 05 2020   R-squared (Overall):              0.0048
Time:                        18:53:06   Log-likelihood                   -90.459
Cov. Estimator:                Robust
                                        F-statistic:                      5.0751
Entities:                          48   P-value                           0.0267
Avg Obs:                       2.9167   Distribution:                    F(1,91)
Min Obs:                       1.0000
Max Obs:                       3.0000   F-statistic (robust):             8.2299
                                        P-value                           0.0051
Time periods:                       3   Distribution:                    F(1,91)
Avg Obs:                       46.667
Min Obs:                       46.000
Max Obs:                       48.000

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.4982     0.0555     8.9714     0.0000      0.3879      0.6085
hrsemp        -0.0054     0.0019    -2.8688     0.0051     -0.0092     -0.0017
==============================================================================

F-test for Poolability: 17.094
P-value: 0.0000
Distribution: F(47,91)

Included effects: Entity