I am estimating a first-difference linear regression taken directly from textbooks:
- James H. Stock and Mark W. Watson, Introduction to Econometrics, Pearson, 4th edition.
- Christoph Hanck, Martin Arnold, Alexander Gerber, and Martin Schmelzer, Introduction to Econometrics with R, https://www.econometrics-with-r.org/10-rwpd.html
- The data come from the AER package, https://cran.r-project.org/web/packages/AER/AER.pdf
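I use the `plm()` function from the `plm` package to estimate the model in first differences, and the `augment()` function from the `broom` package to extract the residuals. I get an error message and suspect that I am not using the `"fd"` option correctly and/or am misusing `augment()`. A similar attempt with `model="pooling"` seems to work. Thanks for any help!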
library(AER)
data(Fatalities)
Fatalities$fatality <- Fatalities$fatal / Fatalities$pop * 10000
library(plm)
library(broom)
plm.pool <- plm(fatality ~ beertax, data=Fatalities, model="pooling")
tidy(plm.pool) # ok
augment(plm.pool) # ok
plm.fd <- plm(fatality ~ beertax, data=Fatalities,
              index=c("state", "year"),
              model="fd")
tidy(plm.fd) # looks ok
# A tibble: 2 × 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept) -0.00314    0.0119   -0.263    0.792
2 beertax      0.0137     0.285     0.0480   0.962
augment(plm.fd) # not ok
Error in `$<-.data.frame`(`*tmp*`, ".resid", value = c(`2` = 0.219840293582125, :
replacement has 288 rows, data has 336
In addition: Warning message:
In get(.Generic)(e1, e2) :
longer object length is not a multiple of shorter object length
EDIT: A WORKAROUND
So I suspect the problem is that the model frame and the residuals returned by `plm` do not have the same number of rows: `length(row.names(plm.fd$model))` is 336, while `length(names(plm.fd$residuals))` is 288.
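Can someone tell me whether the following is a correct way to obtain the residuals and fitted values from a first-difference estimation?
library(dplyr) # provides %>% and left_join()
data.frame(".rownames" = row.names(plm.fd$model), plm.fd$model) %>%
  left_join(data.frame(".rownames" = names(resid(plm.fd)),
                       ".fitted" = fitted(plm.fd),
                       ".resid" = resid(plm.fd)),
            by = ".rownames") -> Fatalities.augmented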
head(Fatalities.augmented)
  .rownames fatality  beertax       .fitted       .resid
1         1  2.12836 1.539379            NA           NA
2         2  2.34848 1.788991  0.0034166261  0.219840294
3         3  2.33643 1.714286 -0.0010225479 -0.007890716
4         4  2.19348 1.652542 -0.0008451287 -0.138968054
5         5  2.66914 1.609907 -0.0005835833  0.479380363
6         6  2.71859 1.560000 -0.0006831177  0.053269973
References:
- https://cran.r-project.org/web/packages/plm/plm.pdf
- https://cran.r-project.org/web/packages/broom/broom.pdf
EDIT, additional reference:
- Using broom::augment with panel data models
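This is caused by a misunderstanding of first-difference (FD) panel models in `broom::augment_columns`, or by a failure to special-case them: the function assumes that the residuals of an FD model have the same length as the predicted values.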
ret$.resid <- residuals0(x)
(https://github.com/tidymodels/broom/blob/069c21e903174fcf5d491091b7c347a9fdcd2999/R/utilities.R#L256)
An FD model compresses the data, so there are fewer residuals than observations passed to the model. You can see this in the `summary` output:
summary(panel3) # FD model
Oneway (individual) effect First-Difference Model
[...]
Balanced Panel: n = 90, T = 7, N = 630
Observations used in estimation: 540
[...]
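The model is given 630 observations. The FD transformation loses one observation per group (the individual dimension), so only 540 transformed observations are used: 630 - 90 = 540.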
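The same bookkeeping can be reproduced without plm. Below is a minimal base-R sketch on a made-up balanced panel (the data frame `toy`, its columns, and its values are invented for illustration); it differences the outcome within each individual and counts what remains.

```r
# Toy balanced panel: n = 3 individuals, T = 4 periods, N = 12 rows.
# All names and values here are made up for illustration.
toy <- data.frame(
  id   = rep(c("a", "b", "c"), each = 4),
  year = rep(2001:2004, times = 3),
  y    = c(1, 3, 6, 10,  2, 2, 5, 9,  0, 4, 4, 7)
)

# First-difference y within each individual. diff() returns T - 1 values
# per group, so the first period of every individual drops out.
dy <- unlist(tapply(toy$y, toy$id, diff), use.names = FALSE)

n_obs  <- nrow(toy)               # N = n * T
n_ind  <- length(unique(toy$id))  # n
n_diff <- length(dy)              # N - n

c(N = n_obs, n = n_ind, differenced = n_diff)
```

Any vector computed on the differenced data, residuals included, therefore has N - n elements rather than N.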
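`broom::augment_columns` tries to put the predicted values (630) and the residuals (540) into the same data frame, which is bound to fail. If that is the intended layout, the missing values could be padded with NA (e.g., by setting the first row of each individual to NA).
My suggestion is to make the developers/maintainers of broom aware of this issue (perhaps by pointing them to this post). An FD panel model from plm can be identified via `plm_object$args$model == "fd"`.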
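Until that is handled upstream, the NA padding can be done by hand. The following is only a sketch using the question's `Fatalities` example: `augment_fd()` is a hypothetical helper name, not a function in broom or plm, and it assumes that `fitted()` returns its values in the same order as `resid()`, as the question's own workaround also does.

```r
library(AER)   # provides the Fatalities data
library(plm)

data(Fatalities)
Fatalities$fatality <- Fatalities$fatal / Fatalities$pop * 10000

plm.fd <- plm(fatality ~ beertax, data = Fatalities,
              index = c("state", "year"), model = "fd")

# augment_fd() is a hypothetical helper, not part of broom or plm.
augment_fd <- function(fit) {
  # Only meant for plm first-difference fits.
  stopifnot(inherits(fit, "plm"), identical(fit$args$model, "fd"))
  res <- resid(fit)
  # Position of each model-frame row in the residual vector; rows lost to
  # differencing have no match, so indexing with NA yields NA.
  pos <- match(row.names(fit$model), names(res))
  data.frame(
    .rownames = row.names(fit$model),
    fit$model,
    .fitted = as.numeric(fitted(fit))[pos],  # assumes same order as resid()
    .resid  = as.numeric(res)[pos]
  )
}

Fatalities.augmented <- augment_fd(plm.fd)
nrow(Fatalities.augmented)                # one row per model-frame row
sum(is.na(Fatalities.augmented$.resid))   # rows without a residual
```

Matching on row names rather than on position keeps each residual next to the observation it belongs to. It is worth checking what scale `fitted()` reports for an FD model before relying on the `.fitted` column.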