How to pass a dynamic test instance for document classification in Java using Weka



I am new to Weka. I am currently doing text classification with Weka and Java. My training data set has one string attribute and one class attribute.

@RELATION test
@ATTRIBUTE tweet string
@ATTRIBUTE class {positive,negative}

I want to create a test instance dynamically and classify it with a Naive Bayes classifier.

public static void main(String[] args) throws FileNotFoundException, IOException, Exception {
    StringToWordVector filter = new StringToWordVector();
    //training set
    BufferedReader reader = null;
    reader = new BufferedReader(new FileReader("D:/suicideTest.arff"));
    Instances train = new Instances(reader);
    train.setClassIndex(train.numAttributes() - 1);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);
    reader.close();

    Attribute tweet = new Attribute("tweet");
    FastVector classVal = new FastVector(2);
    classVal.addElement("positive");
    classVal.addElement("negative");

    FastVector testAttributes = new FastVector(2);
    testAttributes.addElement(tweet);
    testAttributes.addElement(classVal);
    Instance testcase;
    testcase = null;
    testcase.setValue(tweet, "Hello my world");
    testcase.setValue((Attribute) testAttributes.elementAt(1), "?");
    Instances test = null;
    test.add(testcase);
    test = Filter.useFilter(test, filter);
    NaiveBayes nb = new NaiveBayes();
    nb.buildClassifier(train);
    Evaluation eval = new Evaluation(train);
    eval.crossValidateModel(nb, train, 10, new Random(1));

    double pred = nb.classifyInstance(test.instance(0));
    System.out.println("the result is   " + pred);
}

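I have followed the earlier question "How to test a single test case in Weka entered by the user?".

But I still get a java.lang.NullPointerException when I try to set a value on the test instance, at this line:

testcase.setValue(tweet,"Hello my world");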
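
The exception at this line can be reproduced without Weka. The following is only a minimal sketch, assuming a plain StringBuilder stands in for the Weka Instance (the class name NullReferenceSketch and its method names are hypothetical): a variable that has been assigned null refers to no object, so any method call on it throws, while the same call succeeds once an object has been created.

```java
// Hypothetical illustration; StringBuilder stands in for weka.core.Instance.
class NullReferenceSketch {

    // Mirrors "testcase = null; testcase.setValue(...)" from the question.
    static boolean throwsNpe() {
        StringBuilder testcase = null; // declared, but no object exists yet
        try {
            testcase.append("Hello my world"); // method call on a null reference
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The same call works once the variable refers to a real object.
    static String initialized() {
        StringBuilder testcase = new StringBuilder();
        testcase.append("Hello my world");
        return testcase.toString();
    }

    public static void main(String[] args) {
        System.out.println("null reference throws: " + throwsNpe());
        System.out.println("initialized value: " + initialized());
    }
}
```

The same reasoning applies to the `Instances test = null;` line in the question: it would fail in the same way at `test.add(testcase)`.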

This code works fine. An instance can be created,

Instances testSet = new Instances("", allAtt, 1);
double pred = nb.classifyInstance(testSet.instance(0));

and a single instance passed to the classifier:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public static void main(String[] args) throws Exception {
    StringToWordVector filter = new StringToWordVector();

    // Training set
    BufferedReader reader = new BufferedReader(new FileReader("D:/test.arff"));
    Instances train = new Instances(reader);
    reader.close();
    train.setClassIndex(train.numAttributes() - 1);

    // Initialise the filter on the training data, then turn the tweets into word vectors
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    NaiveBayes nb = new NaiveBayes();
    nb.buildClassifier(train);

    // Class attribute; values listed in the same order as in the ARFF header
    ArrayList<String> cls = new ArrayList<String>(2);
    cls.add("positive");
    cls.add("negative");
    Attribute clsAtt = new Attribute("class", cls);

    // A null value list makes "tweet" a string attribute
    ArrayList<Attribute> allAtt = new ArrayList<Attribute>(2);
    allAtt.add(new Attribute("tweet", (List<String>) null));
    allAtt.add(clsAtt);

    // Create an empty test set and set its class index
    Instances testSet = new Instances("", allAtt, 1);
    testSet.setClassIndex(testSet.numAttributes() - 1);

    // Build the test instance; the class value is left missing
    String names = "I want to suiceide";
    Instance inst = new DenseInstance(2);
    inst.setValue(allAtt.get(0), names);
    testSet.add(inst);
    System.out.println(testSet.instance(0).toString());

    // Reuse the filter initialised on the training data (no second
    // setInputFormat call), then classify the filtered instance
    testSet = Filter.useFilter(testSet, filter);
    double pred = nb.classifyInstance(testSet.instance(0));
    String predictString = testSet.classAttribute().value((int) pred);
    System.out.println("the result is " + predictString);
}
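
The test instance has to go through the same word-vector filter that was initialised on the training data, because the classifier expects the attributes learned from the training set. The following is only a rough sketch of that idea in plain Java, not Weka's implementation: the class BagOfWordsSketch and its fit/transform methods are hypothetical names, and word presence is used instead of whatever weighting StringToWordVector is configured with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a word-vector filter; not part of Weka.
class BagOfWordsSketch {
    // word -> column index, fixed once from the training documents
    private final Map<String, Integer> vocabulary = new LinkedHashMap<String, Integer>();

    // Learn the vocabulary from the training documents only.
    void fit(List<String> trainingDocs) {
        for (String doc : trainingDocs) {
            for (String word : tokenize(doc)) {
                vocabulary.putIfAbsent(word, vocabulary.size());
            }
        }
    }

    // Map any document onto the training vocabulary; unseen words are dropped.
    int[] transform(String doc) {
        int[] vector = new int[vocabulary.size()];
        for (String word : tokenize(doc)) {
            Integer column = vocabulary.get(word);
            if (column != null) {
                vector[column] = 1; // mark the word as present
            }
        }
        return vector;
    }

    // Lower-case the text and split on non-word characters.
    private static List<String> tokenize(String doc) {
        List<String> words = new ArrayList<String>();
        for (String w : doc.toLowerCase().split("\\W+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        BagOfWordsSketch filter = new BagOfWordsSketch();
        filter.fit(Arrays.asList("hello world", "goodbye cruel world"));
        // The test document gets exactly as many columns as the training vocabulary.
        System.out.println(Arrays.toString(filter.transform("Hello my world")));
    }
}
```

If the vocabulary were rebuilt from the test document alone, the resulting vector would have different columns from the ones the classifier was trained on, which is why the sketch fits once and only transforms afterwards.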
