Can anyone help me evaluate test set data in Weka?



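I have a training dataset and a test dataset. I am using the Weka Explorer to build a Random Forest model. After building the model, I ran it on my test set data through the (Supplied test set / Re-evaluate model on current test set) options, and it showed the output below.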

What am I doing wrong?

Training the model:

=== Evaluation on training set ===
Time taken to test model on training data: 0.24 seconds
=== Summary ===
Correctly Classified Instances        5243               98.9245 %
Incorrectly Classified Instances        57                1.0755 %
Kappa statistic                          0.9439
Mean absolute error                      0.0453
Root mean squared error                  0.1137
Relative absolute error                 23.2184 %
Root relative squared error             36.4074 %
Coverage of cases (0.95 level)         100      %
Mean rel. region size (0.95 level)      59.3019 %
Total Number of Instances             5300     
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.996    0.067    0.992      0.996    0.994      0.944    0.999     1.000     0
                 0.933    0.004    0.968      0.933    0.950      0.944    0.999     0.990     1
Weighted Avg.    0.989    0.060    0.989      0.989    0.989      0.944    0.999     0.999     
=== Confusion Matrix ===
    a    b   <-- classified as
 4702   18 |    a = 0
   39  541 |    b = 1

Model applied to my test dataset:

=== Evaluation on test set ===
Time taken to test model on supplied test set: 0.22 seconds
=== Summary ===
Total Number of Instances                0     
Ignored Class Unknown Instances               4000     
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.000    0.000    0.000      0.000    0.000      0.000    ?         ?         0
                 0.000    0.000    0.000      0.000    0.000      0.000    ?         ?         1
Weighted Avg.    NaN      NaN      NaN        NaN      NaN        NaN      NaN       NaN       
=== Confusion Matrix ===
 a b   <-- classified as
 0 0 | a = 0
 0 0 | b = 1

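Your test dataset does not appear to have labels. The summary reports "Total Number of Instances 0" and "Ignored Class Unknown Instances 4000", which means every one of the 4000 test instances has a missing class value and was skipped during evaluation.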

You can only evaluate the quality of your predictions using labeled data.
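If you want to confirm this before loading the file into Weka, a quick check along the following lines may help. It is only a minimal sketch, not part of Weka: the function name `count_class_labels` and the sample data are made up for illustration, and it assumes a dense ARFF file whose class is the last attribute and whose values contain no quoted commas. It counts how many data rows have `?` (Weka's marker for a missing value) in the class column.

```python
def count_class_labels(arff_text):
    """Return (labeled, unlabeled) instance counts for a dense ARFF file.

    Assumptions: the class is the last attribute, and no value
    contains a quoted comma.
    """
    labeled = unlabeled = 0
    in_data = False
    for raw in arff_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if not in_data:
            # Everything before @data is header (relation, attributes)
            in_data = line.lower().startswith("@data")
            continue
        class_value = line.rsplit(",", 1)[-1].strip()
        if class_value == "?":
            unlabeled += 1  # missing class: Weka ignores it when evaluating
        else:
            labeled += 1
    return labeled, unlabeled


# Hypothetical sample: two unlabeled rows and one labeled row
sample = """@relation demo
@attribute f1 numeric
@attribute class {0,1}
@data
0.5,?
1.2,?
0.7,1
"""
print(count_class_labels(sample))  # (1, 2)
```

If the first number comes back as 0 for your test file, as the output above suggests it would, Weka has nothing to compare its predictions against, so it cannot compute accuracy or a confusion matrix for that file.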
