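I have a training dataset and a test dataset. I am using the Weka Explorer to build a random forest model. After building the model, I evaluate it on my test data through the "Supplied test set" / "Re-evaluate model on current test set" options, and I get the output shown further down.
What am I doing wrong?
Model evaluated on the training set: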
=== Evaluation on training set ===
Time taken to test model on training data: 0.24 seconds
=== Summary ===
Correctly Classified Instances        5243               98.9245 %
Incorrectly Classified Instances        57                1.0755 %
Kappa statistic                          0.9439
Mean absolute error                      0.0453
Root mean squared error                  0.1137
Relative absolute error                 23.2184 %
Root relative squared error             36.4074 %
Coverage of cases (0.95 level)         100      %
Mean rel. region size (0.95 level)      59.3019 %
Total Number of Instances             5300
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.996    0.067    0.992      0.996    0.994      0.944    0.999     1.000     0
                 0.933    0.004    0.968      0.933    0.950      0.944    0.999     0.990     1
Weighted Avg.    0.989    0.060    0.989      0.989    0.989      0.944    0.999     0.999
=== Confusion Matrix ===
    a    b   <-- classified as
 4702   18 |    a = 0
   39  541 |    b = 1
Model evaluated on my test dataset:
=== Evaluation on test set ===
Time taken to test model on supplied test set: 0.22 seconds
=== Summary ===
Total Number of Instances                0
Ignored Class Unknown Instances       4000
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.000    0.000    0.000      0.000    0.000      0.000    ?         ?         0
                 0.000    0.000    0.000      0.000    0.000      0.000    ?         ?         1
Weighted Avg.    NaN      NaN      NaN        NaN      NaN        NaN      NaN       NaN
=== Confusion Matrix ===
    a    b   <-- classified as
    0    0 |    a = 0
    0    0 |    b = 1
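Your test dataset does not appear to have labels. The summary reports "Ignored Class Unknown Instances 4000" and "Total Number of Instances 0", which means all 4000 test instances have a missing class value and were skipped, so there is nothing left to score and every metric comes out as 0 or NaN.
You can only evaluate the quality of your predictions with labeled data.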
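If you want to confirm this before loading the file into Weka, here is a minimal sketch that counts how many rows of an ARFF file have a missing class value. It is not part of Weka; `count_unlabeled` is a hypothetical helper, and it assumes a dense (non-sparse) ARFF file whose class attribute is the last column, with missing values written as `?` and no quoted commas in the class value.

```python
def count_unlabeled(arff_lines, missing="?"):
    """Return (total_rows, unlabeled_rows) for a dense ARFF file.

    Assumptions (not checked): the class attribute is the LAST column,
    rows are comma-separated, and the file is not in sparse format.
    """
    in_data = False
    total = 0
    unlabeled = 0
    for raw in arff_lines:
        line = raw.strip()
        # Skip blank lines and ARFF comments.
        if not line or line.startswith("%"):
            continue
        if not in_data:
            # Everything before @data is header (relation, attributes).
            if line.lower().startswith("@data"):
                in_data = True
            continue
        total += 1
        # The class value is whatever follows the last comma.
        last_value = line.rsplit(",", 1)[-1].strip()
        if last_value == missing:
            unlabeled += 1
    return total, unlabeled


# Tiny made-up example: three rows, two of them without a class label.
sample = """@relation demo
@attribute x numeric
@attribute class {0,1}
@data
1.5,0
2.0,?
3.1,?
""".splitlines()

print(count_unlabeled(sample))  # (3, 2)
```

For a real file you could pass `open("test.arff")` (a placeholder path) instead of `sample`. If the second number equals the first, as it would for the 4000 instances above, the test set carries no labels and Weka has nothing to compare its predictions against.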