r - How can I correctly predict with this RF model?

I used these commands to fit a random forest classifier to some data. (The Type field is 0 and 1, with some NAs, hence the na.action.)

Websites <- read.csv("malicious_and_benign_websites12.csv")
datasplit = sort(sample(nrow(Websites), nrow(Websites)*.8))
train<-Websites[datasplit,]
test<-Websites[-datasplit,]
install.packages("caret") 
library(caret)
RF_model <- train(as.factor(Type) ~ .,
  data = train,
  method = 'ranger',
  na.action = na.exclude
)
RF_model

After some training time this all worked, but then I needed to predict with the model and build a confusion matrix, using these commands.

datasplitRFPred <- predict(RF_model, test)
confusionMatrix(datasplitRFPred, as.factor(test$Type))

The trouble started when I got this error:

> datasplitRFPred <- predict(RF_model, test)
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = 
object$xlevels) : 
factor URL has new levels B0_10, B0_101, B0_1016, B0_1019, B0_1028, B0_1057, B0_1088, B0_1093, B0_1102, B0_1113, B0_1146, B0_1160, B0_1161, B0_120, B0_1206, B0_1211, B0_123, B0_1233, B0_1257, B0_126, B0_1260, B0_1279, B0_1296, B0_1300, B0_1302, B0_1304, B0_1317, B0_1330, B0_1340, B0_1347, B0_1366, B0_1369, B0_1378, B0_1386, B0_1387, B0_1392, B0_1394, B0_1403, B0_1404, B0_1410, B0_1412, B0_1419, B0_157, B0_158, B0_168, B0_181, B0_2011, B0_2024, B0_203, B0_2051, B0_206, B0_207, B0_2072, B0_2075, B0_2077, B0_2111, B0_2112, B0_2116, B0_2119, B0_2122, B0_2130, B0_2153, B0_2159, B0_216, B0_2168, B0_2169, B0_2221, B0_2228, B0_2235, B0_2236, B0_2241, B0_2273, B0_2282, B0_2287, B0_2309, B0_235, B0_237, B0_244, B0_254, B0_28, B0_281, B0_289, B0_296, B0_307, B0_312, B0_314, B0_331, B0_334, B0_335, B0_34, B0_341, B0_343, B0_348, B0_354, B0_36, B0_408, B0_421, B0_422, B0_429, B0_438, B0_444, B0_447, B0_46, B0_471, B0_497, B0_518, B0_529, B0_531, B0_533, B0_535, B0_536, B0_537, B0_554, B0_5
confusionMatrix(datasplitRFPred, as.factor(test$Type))

Is there any way to fix this?

I can't build the confusion matrix or get the performance metrics.

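I would convert Type to a factor in the Websites dataset beforehand. You convert the training data inside train() and the test data inside confusionMatrix(), but then pass the raw test data to predict() without the factor conversion.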
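As a rough illustration of that suggestion, here is a minimal sketch on a toy data frame standing in for the real CSV. Everything except the URL and Type column names is hypothetical (the predictors x1 to x3, the names train_df and test_df, the 3-fold CV setting). It also drops the URL column before training. That step is an assumption on my part and not part of the answer above: the error message names URL rather than Type, and a column with a distinct value on nearly every row will always produce levels in the test set that the model never saw.

```r
# Toy stand-in for the Websites data; x1..x3 are hypothetical predictors.
set.seed(42)
n <- 200
Websites <- data.frame(
  URL  = paste0("B0_", seq_len(n)),  # one distinct value per row, like an ID
  x1   = rnorm(n),
  x2   = rnorm(n),
  x3   = rnorm(n),
  Type = rbinom(n, 1, 0.3),          # 0/1 label
  stringsAsFactors = TRUE
)

# Convert the label once, before splitting, so both subsets share its levels.
Websites$Type <- factor(Websites$Type, levels = c(0, 1))

# Assumption: URL is an identifier, not a feature, so it is removed.
Websites$URL <- NULL

# Same 80/20 split idea as in the question.
idx <- sort(sample(nrow(Websites), floor(nrow(Websites) * 0.8)))
train_df <- Websites[idx, ]
test_df  <- Websites[-idx, ]

# The modelling part runs only if caret and ranger are installed.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("ranger", quietly = TRUE)) {
  library(caret)
  RF_model <- train(Type ~ ., data = train_df, method = "ranger",
                    trControl = trainControl(method = "cv", number = 3))
  pred <- predict(RF_model, newdata = test_df)
  print(confusionMatrix(pred, test_df$Type))
}
```

If the real data has other high-cardinality text columns, they would probably need the same treatment (dropping or recoding) before the split. Rows with NA would also need handling so that the predictions and test_df$Type stay the same length.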
