R - How to handle special characters in variable names when calling feols?



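I am trying to write a function that returns the coefficient and standard error from a fixed-effects (FE) regression, because I need to run a large number of regressions. The data might look like the example below. Many column names contain special characters such as spaces, -, & and digits.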

library(data.table)
library(fixest)
library(broom)
data<-data.table(Date = c("2020-01-01","2020-01-01","2020-01-01","2020-01-01","2020-02-01","2020-02-01","2020-02-01","2020-02-01"),
Card = c(1,2,3,4,1,2,3,4),
A = rnorm(8),
B = rnorm(8),
C = rnorm(8),
D = rnorm(8)
)
setnames(data, old = "A", new = "A-A")
setnames(data, old = "B", new = "B B")
setnames(data, old = "C", new = "C&C")
setnames(data, old = "D", new = "1-D")

Thanks to @Ronak Shah and @Laurent Bergé, who provided the following two good candidate functions.

estimation_fun <- function(col1,col2,df) {
regression<-feols(as.formula(sprintf('%s ~ %s | Card + Date', col1, col2)), df)
est =tidy(regression)$estimate
se = tidy(regression)$std.error
output <- list(est,se)
return(output)
}

estimation_fun <- function(col1, col2, df) {
# .[col1] and .[col2] insert the column names stored in col1 and col2
regression<-feols(.[col1] ~ .[col2] | Card + Date, df)
est =tidy(regression)$estimate
se = tidy(regression)$std.error
output <- list(est,se)
return(output)
}

Both work if the column names are simply "A", "B", "C" and so on. However, try this call:

estimation_fun("A-A","B B",data)
Error in feols(as.formula(sprintf("%s ~ %s | Card + Date", col1, col2)), : 
Argument 'fml' could not be evaluated: <text>:1:9: unexpected symbol
1: A-A ~ B B
^
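The message points at R's parser rather than at the regression itself: once the names are pasted into a string, `A-A` reads as a subtraction and `B B` as two adjacent symbols. A minimal base R sketch of that behaviour follows; `parses` is a hypothetical helper introduced only for illustration, and whether feols accepts the backtick-quoted form is a separate question that this sketch does not answer.

```r
# Hypothetical helper: TRUE if the text parses as R code, FALSE otherwise.
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

raw_fml    <- sprintf("%s ~ %s", "A-A", "B B")       # "A-A ~ B B"
quoted_fml <- sprintf("`%s` ~ `%s`", "A-A", "B B")   # "`A-A` ~ `B B`"

parses(raw_fml)     # FALSE: "B B" is two symbols with no operator between them
parses(quoted_fml)  # TRUE: backticks turn each name into a single symbol

# Even when an unquoted name happens to parse, it means something else:
vars_raw    <- all.vars(as.formula("A-A ~ x"))    # "A" "x"   (A minus A)
vars_quoted <- all.vars(as.formula("`A-A` ~ x"))  # "A-A" "x"
```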

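I am looking for a feols formula format that can handle this case. Suggestions that simply strip these special characters from the column names are also welcome, though that would be second best.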

Thanks to the great community here!

Consider replacing the special characters with _:

setnames(data, gsub("[-& ]", "_", names(data)))
setnames(data, make.names(names(data)))
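Because renaming discards the original labels, it may help to keep a lookup from cleaned names back to the originals. The following base R sketch applies the same two steps to a plain character vector; `old_names`, `new_names` and `name_map` are illustrative names, and it assumes no two columns collapse to the same cleaned name.

```r
old_names <- c("Date", "Card", "A-A", "B B", "C&C", "1-D")

# Step 1: replace '-', '&' and spaces with '_'
step1 <- gsub("[-& ]", "_", old_names)

# Step 2: make.names() repairs whatever is still not a valid R name,
# here the leading digit in "1_D"; unique = TRUE guards against duplicates
new_names <- make.names(step1, unique = TRUE)

# Lookup table: cleaned name -> original name
name_map <- setNames(old_names, new_names)
name_map[["X1_D"]]  # "1-D"
```

The lookup can then be used to relabel results with the original column names when reporting.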

Check the data:

> data
         Date Card         A_A        B_B         C_C        X1_D
1: 2020-01-01    1  0.19083908  0.4835800 -0.08755933  1.01311944
2: 2020-01-01    2 -0.57726617  0.6421043  1.12987445 -0.52168711
3: 2020-01-01    3  2.02653159 -1.4505543 -0.43367868 -0.04474157
4: 2020-01-01    4 -0.20575821  0.4691786 -1.58562690  0.49362528
5: 2020-02-01    1 -0.03461155 -0.2913712 -0.16457341 -0.07701185
6: 2020-02-01    2 -0.50734472 -0.7545768 -0.53227356  0.46468144
7: 2020-02-01    3  0.76653913 -0.1634451  1.00350319  0.25886312
8: 2020-02-01    4  0.33414436  0.6395322  1.10383819 -1.08479631

Test:

estimation_fun('A_A', 'B_B', data)
[[1]]
[1] -0.3915516
attr(,"type")
[1] "Clustered (Card)"
[[2]]
[1] 0.2658773
attr(,"type")
[1] "Clustered (Card)"

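Backticks usually work for names like these, but with feols they fail. The safe option is therefore to replace the special characters with _, either with clean_names from janitor or with gsub.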
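If the original column names should stay untouched in the caller's data, the cleaning can be moved inside the function. This is only a sketch under stated assumptions: `clean_col` and `estimation_fun_clean` are hypothetical names, the fixed-effect columns are assumed to be called Card and Date, and no two columns are assumed to clean to the same name. It uses `coef()` and `fixest::se()` in place of `broom::tidy()`.

```r
# Hypothetical helper: map raw column names to cleaned ones,
# using the same gsub + make.names steps as above.
clean_col <- function(x) make.names(gsub("[-& ]", "_", x))

# Hypothetical wrapper: accepts the ORIGINAL column names, cleans a copy
# of the data, and runs the fixed-effects regression on the cleaned names.
estimation_fun_clean <- function(col1, col2, df) {
  df <- as.data.frame(df)             # work on a copy; the caller's data is untouched
  names(df) <- clean_col(names(df))
  fml <- as.formula(sprintf("%s ~ %s | Card + Date",
                            clean_col(col1), clean_col(col2)))
  fit <- fixest::feols(fml, data = df)
  list(est = unname(coef(fit)), se = unname(fixest::se(fit)))
}
```

With the example data from the question, the call would be estimation_fun_clean("A-A", "B B", data).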
