r - How to create a precision-recall curve from a random forest model



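I am trying to create a precision-recall curve from a random forest model, based only on the training data. This is similar to this question, but I don't know the code for creating the PR curve. See my reproducible example below (modified to match my personal dataset):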

# Load packages and the beaver2 data
library(dplyr)
library(caret)
View(beaver2)
# Recode the outcome as "no"/"yes"
beaver2 <- beaver2 %>% mutate(activ = ifelse(activ == 0, "no", "yes"))
# Convert the outcome to a factor
beaver2$activ <- as.factor(beaver2$activ)
# Create trControl
data_ctrl_null <- trainControl(method = "cv", number = 5, classProbs = TRUE, summaryFunction = twoClassSummary, savePredictions = TRUE, sampling = NULL)
# Create rf model
rf_model <- train(activ ~ ., data = beaver2, trControl = data_ctrl_null, method = "rf", preProc = c("center", "scale"), metric = "ROC", importance = TRUE)
# Create precision-recall curve
library("PRROC")

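I would like to use the PRROC package. How do I get predictions from the random forest model and create a PR curve? Note that I want to create the predictions on my training data, so imagine there is no test data to predict on. Many thanks for your help!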

# Load packages and the beaver2 data
library(dplyr)
library(caret)
library(PRROC)
data(beaver2)
# Recode the outcome as "no"/"yes" and convert it to a factor
beaver2 <- beaver2 %>% mutate(activ = ifelse(activ == 0, "no", "yes"))
beaver2$activ <- as.factor(beaver2$activ)
# Create trControl
data_ctrl_null <- trainControl(method = "cv", number = 5, classProbs = TRUE, summaryFunction = twoClassSummary, savePredictions = TRUE, sampling = NULL)
# Create rf model
rf_model <- train(activ ~ ., data = beaver2, trControl = data_ctrl_null, method = "rf", preProc = c("center", "scale"), metric = "ROC", importance = TRUE)
# Predict on the training data (use unseen test data here instead if you have it)
test_data <- beaver2 %>% select(-activ)
pred_class <- predict(rf_model, newdata = test_data)
# pr.curve() needs numeric scores, not class labels, so take the predicted probability of "yes"
pred_prob <- predict(rf_model, newdata = test_data, type = "prob")[, "yes"]
# Confusion matrix; expect it to be close to perfect, since the training data doubles as test data
confusion_Matrix <- table(Predictions = pred_class, Reference = beaver2$activ)
# Create precision-recall curve: scores of the true "yes" rows vs. scores of the true "no" rows
fg <- pred_prob[beaver2$activ == "yes"]
bg <- pred_prob[beaver2$activ == "no"]
pr <- pr.curve(scores.class0 = fg, scores.class1 = bg, curve = TRUE)
plot(pr)
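
Predictions made on the same rows the model was fitted on tend to look close to perfect. When there is no test set, one possible alternative is to build the curve from the hold-out predictions that caret saves during cross-validation. The following is only a sketch under a few assumptions: it refits a simpler model so the block is self-contained, it uses `savePredictions = "final"` so that `rf_model$pred` holds one hold-out row per observation for the selected `mtry`, and the names `ctrl`, `cv_pred`, and `pr_cv` are illustrative.

```r
library(caret)   # train(), trainControl(), twoClassSummary()
library(PRROC)   # pr.curve()

data(beaver2)
beaver2$activ <- factor(ifelse(beaver2$activ == 0, "no", "yes"))

set.seed(42)  # arbitrary seed, only for reproducible folds
ctrl <- trainControl(method = "cv", number = 5, classProbs = TRUE,
                     summaryFunction = twoClassSummary,
                     savePredictions = "final")  # keep hold-out predictions of the best model
rf_model <- train(activ ~ ., data = beaver2, method = "rf",
                  trControl = ctrl, metric = "ROC")

# Each row of rf_model$pred was predicted by a model that did not see that row
cv_pred <- rf_model$pred
fg <- cv_pred$yes[cv_pred$obs == "yes"]  # P("yes") for truly active rows
bg <- cv_pred$yes[cv_pred$obs == "no"]   # P("yes") for truly inactive rows

pr_cv <- pr.curve(scores.class0 = fg, scores.class1 = bg, curve = TRUE)
pr_cv$auc.integral  # area under the PR curve
plot(pr_cv)
```

The resulting curve is usually less optimistic than one drawn from in-sample predictions, because every score comes from a fold in which that row was held out.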

See also the replies here: https://stats.stackexchange.com/questions/10501/calculating-aupr-in-r
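
To see why a PR curve needs numeric scores rather than predicted class labels, it may help to compute the points by hand. The base-R sketch below uses made-up toy scores and a hypothetical helper `pr_points()`. It assumes all scores are distinct (ties are not handled) and is meant as an illustration, not as a replacement for PRROC.

```r
# Hypothetical toy data: predicted probability of the positive class and true labels
scores <- c(0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10)
labels <- c(1, 1, 0, 1, 0, 1, 0, 0)  # 1 = positive ("yes"), 0 = negative ("no")

pr_points <- function(scores, labels) {
  ord <- order(scores, decreasing = TRUE)  # sweep the cutoff from high to low
  labels <- labels[ord]
  tp <- cumsum(labels == 1)  # true positives when the top-k rows are called positive
  fp <- cumsum(labels == 0)  # false positives at the same cutoff
  data.frame(threshold = scores[ord],
             recall    = tp / sum(labels == 1),
             precision = tp / (tp + fp))
}

pts <- pr_points(scores, labels)
print(pts)
```

Each distinct score yields one (recall, precision) point. Hard class labels take only two values, so they cannot trace out a full curve.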

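If you don't like the PRROC package, I highly recommend the evalm function from the MLeval package. It works very well with caret and is simple to use.

In your case, all you need to do is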

library(MLeval)
x <- evalm(rf_model)
x$roc    # ROC curve
x$proc   # precision-recall curve
x$stdres # performance metrics at the default 0.5 probability cutoff
x$cc     # calibration plot

That covers your training dataset. For a test dataset, use this code:

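# test.data and outcome are placeholders for your held-out data and its label column
pred <- predict(rf_model, newdata = test.data, type = "prob")
test <- evalm(data.frame(pred, test.data$outcome))
test$roc
test$stdres
test$cc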
