Rpart: scoring a decision tree in R



In Python with scikit-learn, you can fit a decision tree and get its score with the following code:

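from sklearn import tree

# rseed, train_features, train_outputs, test_features and test_outputs are defined elsewhere
tr = tree.DecisionTreeClassifier(random_state=rseed, min_samples_split=2, ccp_alpha=0.005)
model_tree = tr.fit(train_features, train_outputs)
# score() returns the mean accuracy on the given data and labels
print(f'Model Train Accuracy: {model_tree.score(train_features, train_outputs)}')
print(f'Model Test Accuracy: {model_tree.score(test_features, test_outputs)}')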

This produces:

Model Train Accuracy: 0.5942
Model Test Accuracy: 0.4933

How can I get similar scores (on both the training and the test data) with rpart in R?

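In short:

  1. Compute the error rate as shown below; accuracy is 1 minus the error rate.
  2. Make sure you use the same parameters and control parameters in Python and in R (see https://www.rdocumentation.org/packages/rpart/versions/4.1-15/topics/rpart.control)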
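library(rpart)

# Response, Predictor1, PredictorX, train and test are placeholders for your own data
model_tree <- rpart(Response ~ Predictor1 + PredictorX,
                    data = train, method = "class",
                    control = rpart.control(cp = 0.005, minsplit = 2))  # add other control parameters as needed
pred_train <- predict(model_tree, type = "class")
pred_test <- predict(model_tree, newdata = test, type = "class")
# error rate (train set)
mean(pred_train != train$Response)
# accuracy (train set), the counterpart of sklearn's score()
mean(pred_train == train$Response)
# error rate (test set)
mean(pred_test != test$Response)
# accuracy (test set)
mean(pred_test == test$Response)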
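
The link between the error rate computed above and the accuracy that scikit-learn's score() reports can be sketched in plain Python. This is only an illustration: the labels and the helper names error_rate and accuracy are made up here and come from neither library.

```python
# Hypothetical labels, invented purely for illustration.
actual = ["a", "b", "a", "c", "b"]
predicted = ["a", "b", "c", "c", "a"]


def error_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


err = error_rate(actual, predicted)
acc = accuracy(actual, predicted)
print(f"error rate: {err}, accuracy: {acc}")
```

The two numbers always sum to 1, so an error rate from R can be compared with a scikit-learn score by subtracting it from 1.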
