Getting predictions on the test sets in MLR



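I am using the mlr package in R to fit classification models for a binary problem. For each model, I run cross-validation with embedded feature selection using the selectFeatures function and retrieve the mean AUC over the test sets. Next, I would like to retrieve the predictions on the test set of each fold, but this function does not seem to support that. I have tried plugging the selected predictors into the resample function to get them. This works, but the performance metrics differ, which makes it unsuitable for my analysis. I also checked whether this is possible in the caret package, but at first glance I did not see a solution. Any idea how to do this?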

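Here is my code using synthetic data, along with my attempt using the resample function (again: not suitable in its current form, because the performance metrics differ).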

# 1. Find a synthetic dataset for supervised learning (two classes)
###################################################################
install.packages("mlbench")
library(mlbench)
# generate 1000 rows, 21 quantitative candidate predictors and 1 target variable
p<-mlbench.waveform(1000)
# convert list into dataframe
dataset<-as.data.frame(p)
# drop third class to get 2 classes
dataset2  = subset(dataset, classes != 3)
# 2. Perform cross validation with embedded feature selection
#############################################################
library(BBmisc)
library(nnet)
library(mlr)
# Choice of algorithm i.e. neural network
mL <- makeLearner("classif.nnet", predict.type = "prob")
# Choice of sampling plan: 10 fold cross validation with stratification of target classes 
mRD = makeResampleDesc("CV", iters = 10,stratify = TRUE)
# Choice of feature selection strategy (stepwise family: sequential floating forward search)
# alpha is the minimal improvement required for a forward step
ctrl = makeFeatSelControlSequential(method = "sffs", maxit = NA,alpha = 0.001)
# Choice of seed 
set.seed(12)
# Choice of data 
mCT <- makeClassifTask(data =dataset2, target = "classes")
# Perform the method
result = selectFeatures(mL,mCT, mRD, control = ctrl, measures = list(mlr::auc,mlr::acc,mlr::brier))
# Retrieve AUC and selected variables
analyzeFeatSelResult(result)
# Result: auc.test.mean=0.9614525 Variables selected: x.10, x.11, x.15, x.17, x.18    
# 3. Retrieve predictions on test sets (to later perform DeLong tests on AUCs derived from multiple sets of candidate variables)
#################################################################################################################################
# create new dataset with selected predictors
keep <- c("x.10","x.11","x.15","x.17","x.18","classes")
dataset3 <- dataset2[ , names(dataset2) %in% keep]
# Perform same tasks with resample function instead of selectFeatures function to get predictions on test sets
mL <- makeLearner("classif.nnet", predict.type = "prob")   
ctrl = makeFeatSelControlSequential(method = "sffs", maxit = NA,alpha = 0.001)
mRD = makeResampleDesc("CV", iters = 10,stratify = TRUE)
set.seed(12)
mCT <- makeClassifTask(data =dataset3, target = "classes")
r1r = resample(mL, mCT, mRD, measures = list(mlr::auc,mlr::acc,mlr::brier))
# Result: auc.test.mean=0.9673023
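ctrl is missing from your code.

To get the predictions of a resample object, simply use getRRPredictions(r1r). The per-fold performance values are in r1r$measures.test.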
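As a minimal sketch of the suggestion above, the following rebuilds the two-class waveform data from the question, runs a plain 10-fold resample, and pulls out the per-fold test predictions. It assumes the mlbench, nnet and mlr packages are installed. The names task, lrn, rdesc, r and pred_df are illustrative, and trace = FALSE is added only to silence nnet's fitting output. For brevity it uses all 21 predictors instead of the selected subset.

```r
library(mlbench)
library(mlr)

set.seed(12)

# Two-class synthetic data, built the same way as in the question
dataset  <- as.data.frame(mlbench.waveform(1000))
dataset2 <- droplevels(subset(dataset, classes != 3))

task  <- makeClassifTask(data = dataset2, target = "classes")
lrn   <- makeLearner("classif.nnet", predict.type = "prob", trace = FALSE)
rdesc <- makeResampleDesc("CV", iters = 10, stratify = TRUE)

# Plain cross-validation (no feature selection in this sketch)
r <- resample(lrn, task, rdesc,
              measures = list(mlr::auc, mlr::acc),
              show.info = FALSE)

# Test-set predictions for every fold, as one data frame;
# the iter column identifies the fold each row was predicted in
pred    <- getRRPredictions(r)
pred_df <- as.data.frame(pred)
head(pred_df)

# Per-fold performance values and their aggregate
r$measures.test
r$aggr
```

The rows of pred_df can be split by iter to get the predictions fold by fold, for instance as input to a later AUC comparison.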
