How do I simulate classification data for a random forest in R?

How can I simulate data in R that is suitable for classification with a random forest?

For regression, I would do something like this:

n <- 1000                                # number of observations
p <- 3                                   # number of predictors
e <- rnorm(n)                            # noise term
b <- 10                                  # defined here but not used below
xVal <- matrix(rnorm(n * p), nrow = n)   # Create matrix with 3 columns
colnames(xVal) <- paste0("x", 1:p)       # Name columns
df <- data.frame(xVal)                   # Create data frame
# Make x1 a useful predictor of y:
y <- df$x1 + e
df$y <- y

Which looks like this:

head(df,3)
          x1         x2          x3            y
1 -0.6512695  0.3639012 -0.50231648 -0.296679882
2 -1.1393367 -0.8148882  0.33065078 -2.703743889
3 -0.2674592 -0.2670326 -0.15028117  1.024109832

Here x1 is a useful predictor of y, while x2 and x3 are just random noise. I'd then simply fit a random forest regression model to the data.
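For concreteness, the fitting step described above might look like the sketch below. The question does not say which package is used, so the `randomForest` package is an assumption here, as are the seed and the name `rf_reg`. The fit is guarded so the data simulation still runs when the package is not installed.

```r
# Sketch only: the randomForest package, the seed, and the name rf_reg are
# assumptions, not part of the original question.
set.seed(1)                                  # hypothetical seed, for reproducibility

n <- 1000
p <- 3
xVal <- matrix(rnorm(n * p), nrow = n)       # three independent standard-normal columns
colnames(xVal) <- paste0("x", 1:p)
df <- data.frame(xVal)
df$y <- df$x1 + rnorm(n)                     # only x1 carries signal

# Fit a regression forest only if the package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_reg <- randomForest::randomForest(y ~ ., data = df, importance = TRUE)
  print(randomForest::importance(rf_reg))    # variable importance for x1, x2, x3
}
```

If the simulation behaves as intended, x1 should come out well ahead of x2 and x3 in the importance table.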

How would I do something similar for classification?

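# x1 comes from two normal distributions whose means differ by class
x1 <- c(rnorm(500, 0, 1), rnorm(500, 3, 1))
x2 <- rnorm(1000)                        # noise
x3 <- rnorm(1000)                        # noise
class <- factor(rep(1:2, each = 500))    # first 500 rows are class 1, the rest class 2
plot(x1, x2, pch = 20, col = class)      # the classes separate along x1 only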

x1 is a useful predictor of class; x2 and x3 are just noise.
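To use these variables in a model, they can be collected into a data frame and passed to a classification forest. The sketch below again assumes the `randomForest` package; the seed and the names `dat` and `rf_cls` are hypothetical. Because `class` is a factor, `randomForest()` fits a classification forest rather than a regression forest.

```r
# Sketch only: the randomForest package, the seed, and the names dat and
# rf_cls are assumptions, not part of the original answer.
set.seed(1)                                  # hypothetical seed, for reproducibility

x1 <- c(rnorm(500, 0, 1), rnorm(500, 3, 1))  # class means 0 and 3
x2 <- rnorm(1000)                            # noise
x3 <- rnorm(1000)                            # noise
class <- factor(rep(1:2, each = 500))

dat <- data.frame(x1 = x1, x2 = x2, x3 = x3, class = class)

# Fit a classification forest only if the package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_cls <- randomForest::randomForest(class ~ ., data = dat, importance = TRUE)
  print(rf_cls)                              # out-of-bag error and confusion matrix
  print(randomForest::importance(rf_cls))    # variable importance for x1, x2, x3
}
```

The two class distributions overlap (their means are three standard deviations apart), so some out-of-bag misclassification is expected. Moving the second mean closer to 0 makes the problem harder, and moving it further away makes it easier.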
