Let's create a data frame with some fake data:
data_ex <- data.frame(y = runif(5,0,1), a1 = runif(5,0,1), b2 = runif(5,0,1),
c3 = runif(5,0,1), d4 = runif(5,0,1))
> data_ex
y a1 b2 c3 d4
1 0.162 0.221 0.483 0.989 0.558
2 0.445 0.854 0.732 0.723 0.259
3 0.884 0.041 0.893 0.985 0.947
4 0.944 0.718 0.338 0.238 0.592
5 0.094 0.867 0.026 0.334 0.314
The model formula is:
forml <- as.formula("y ~ a1 + b2 + a1:c3:d4 + a1:c3 + a1:b2 + a1:b2:c3")
> forml
y ~ a1 + b2 + a1:c3:d4 + a1:c3 + a1:b2 + a1:b2:c3
The resulting model.matrix is:
> as.matrix(model.matrix(forml, data_ex))
(Intercept) a1 b2 a1:c3 a1:b2 a1:c3:d4 a1:b2:c3
1 1 0.221 0.483 0.218 0.107 0.122 0.105
2 1 0.854 0.732 0.617 0.625 0.160 0.452
3 1 0.041 0.893 0.040 0.036 0.038 0.036
4 1 0.718 0.338 0.171 0.243 0.101 0.058
5 1 0.867 0.026 0.290 0.022 0.091 0.008
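As you can see, the columns are reordered from the lowest interaction order to the highest. I am looking for a way to force model.matrix to follow the exact order of the terms in the formula. The resulting matrix should look like this: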
> Correct_matrix
(Intercept) a1 b2 a1:c3:d4 a1:c3 a1:b2 a1:b2:c3
1 1 0.221 0.483 0.122 0.218 0.107 0.105
2 1 0.854 0.732 0.160 0.617 0.625 0.452
3 1 0.041 0.893 0.038 0.040 0.036 0.036
4 1 0.718 0.338 0.101 0.171 0.243 0.058
5 1 0.867 0.026 0.091 0.290 0.022 0.008
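The reordering can be inspected on the terms object itself, without building a matrix. The following is a minimal sketch that assumes only the formula from above; the variable names tt_default, default_labels and default_order are illustrative. It prints the term labels in the order R stores them by default, together with the interaction order of each term.

```r
# Same formula as in the question
forml <- as.formula("y ~ a1 + b2 + a1:c3:d4 + a1:c3 + a1:b2 + a1:b2:c3")

# Default behaviour: terms() sorts terms by interaction order
tt_default <- terms(forml)

# Labels of the terms, in the order model.matrix will use for its columns
default_labels <- attr(tt_default, "term.labels")

# Interaction order of each term (1 = main effect, 2 = two-way, ...)
default_order <- attr(tt_default, "order")

print(default_labels)
print(default_order)
```

The printed labels should line up with the column names of the model.matrix output shown earlier (after the intercept), which suggests the sorting happens when the terms are built rather than inside model.matrix.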
You can create a terms object and use keep.order = TRUE to preserve the order of the terms. The resulting object can be passed to model.matrix.
model.matrix(terms(forml, keep.order = TRUE), data_ex)
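Result (the values differ from those in the question, presumably because data_ex was regenerated with fresh random draws):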
(Intercept) a1 b2 a1:c3:d4 a1:c3 a1:b2 a1:b2:c3
1 1 0.4604044 0.10968326 0.198301034 0.3015807 0.05049866 0.03307836
2 1 0.4795555 0.61339588 0.018934135 0.2205621 0.29415737 0.13529189
3 1 0.7560366 0.67036486 0.001418541 0.4465991 0.50682035 0.29938436
4 1 0.4490247 0.69179890 0.135388984 0.1376586 0.31063480 0.09523209
5 1 0.7198557 0.08595737 0.131564438 0.2918157 0.06187690 0.02508371
attr(,"assign")
[1] 0 1 2 3 4 5 6
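To confirm that the columns really follow the formula, the check can be scripted. This is a sketch under stated assumptions: the seed value 42 is arbitrary and was added only for reproducibility (the original post sets no seed), and the names tt, mm and expected_cols are illustrative.

```r
# Arbitrary seed, an assumption added here so the example is reproducible
set.seed(42)
data_ex <- data.frame(y = runif(5), a1 = runif(5), b2 = runif(5),
                      c3 = runif(5), d4 = runif(5))

forml <- y ~ a1 + b2 + a1:c3:d4 + a1:c3 + a1:b2 + a1:b2:c3

# keep.order = TRUE stops terms() from sorting by interaction order
tt <- terms(forml, keep.order = TRUE)
mm <- model.matrix(tt, data_ex)

# Column names should be the intercept followed by the terms as written
expected_cols <- c("(Intercept)", attr(tt, "term.labels"))
print(identical(colnames(mm), expected_cols))

# Spot check: the three-way column is the product of its variables
print(all.equal(unname(mm[, "a1:c3:d4"]), with(data_ex, a1 * c3 * d4)))
```

Comparing against attr(tt, "term.labels") rather than a hand-typed vector keeps the check valid if the formula is edited later.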