Calling Weka from Java in MyEclipse

Code example

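package test;

import java.io.File;
import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class WekaTest {
	public static void main(String[] args) throws Exception {
		Classifier classifier = new J48();
		// Training data file
		File inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
		ArffLoader loader = new ArffLoader();
		loader.setFile(inputFile);
		// Load the training set
		Instances instancesTrain = loader.getDataSet();
		// Use the first attribute (index 0) as the class attribute
		instancesTrain.setClassIndex(0);
		// Train the classifier
		classifier.buildClassifier(instancesTrain);

		// Test data file (here the same file as the training data,
		// so this is an evaluation on the training set)
		inputFile = new File("D:/Program Files/Weka-3-6/data/cpu.with.vendor.arff");
		loader.setFile(inputFile);
		// Load the test set
		Instances instancesTest = loader.getDataSet();
		// Set the index of the class attribute (attributes are numbered from 0);
		// instancesTest.numAttributes() returns the total number of attributes
		instancesTest.setClassIndex(0);

		// Number of test instances
		int total = instancesTest.numInstances();
		// Number of correctly classified test instances
		int correct = 0;
		// Classify every test instance
		for (int i = 0; i < total; i++) {
			// Compare the predicted class with the labelled class (the class column of the
			// test data must hold the correct answers for the result to be meaningful)
			if (classifier.classifyInstance(instancesTest.instance(i)) == instancesTest.instance(i).classValue()) {
				correct++;
			}
		}
		// The printed value is the fraction of correctly classified instances
		// (accuracy), despite the "precision" label
		System.out.println("J48 classification precision:" + ((double) correct / total));
	}
}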

Steps

  1. Create a new Java project and add a class named WekaTest
  2. Add weka.jar to the project's build path (from the Weka installation directory: D:\Program Files\Weka-3-6\weka.jar)

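Question

The call itself works without problems, but the result differs from the one Weka itself produces. Both outputs are pasted below; could someone who understands this explain the difference?

Program output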

J48 classification precision:0.8373205741626795
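
Before comparing the two results, it may be worth checking that they come from the same data. The code loads cpu.with.vendor.arff, while the Weka log that follows reports the relation bank-data with 600 instances. The sketch below is a plain-Java sanity check with no Weka dependency: it tests whether the printed fraction could correspond to a whole number of correct predictions for a given instance count. The class and method names are made up for illustration, and the count 209 is an assumption about the stock cpu.with.vendor.arff file; 600 is taken from the log.

```java
// Hypothetical helper: checks which instance counts are compatible with a printed accuracy.
final class AccuracyCountCheck {

    /** Number of correct predictions implied by accuracy * total, rounded to the nearest integer. */
    static long impliedCorrect(double accuracy, int total) {
        return Math.round(accuracy * total);
    }

    /** Distance between accuracy * total and the nearest whole number. */
    static double residual(double accuracy, int total) {
        double implied = accuracy * total;
        return Math.abs(implied - Math.rint(implied));
    }

    public static void main(String[] args) {
        double printed = 0.8373205741626795; // value printed by the program
        // 600 comes from the Weka log; 209 is an assumed size for cpu.with.vendor.arff
        int[] candidates = {600, 209};
        for (int total : candidates) {
            System.out.println(total + " instances: implied correct = "
                    + impliedCorrect(printed, total)
                    + ", residual = " + residual(printed, total));
        }
    }
}
```

A residual close to zero means the count is compatible with the printed value. A clearly non-zero residual means the fraction cannot have been computed over that many instances, which would suggest that the two runs used different files.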

Weka output:


=== Run information ===

Scheme:weka.classifiers.trees.J48 -C 0.25 -M 2
Relation:     bank-data-weka.filters.unsupervised.attribute.Remove-R1
Instances:    600
Attributes:   11
              age
              sex
              region
              income
              married
              children
              car
              save_act
              current_act
              mortgage
              pep
Test mode:evaluate on training data

=== Classifier model (full training set) ===

J48 pruned tree
------------------

children <= 1
|   children <= 0
|   |   married = NO
|   |   |   mortgage = NO: YES (48.0/3.0)
|   |   |   mortgage = YES
|   |   |   |   save_act = NO: YES (12.0)
|   |   |   |   save_act = YES: NO (23.0)
|   |   married = YES
|   |   |   save_act = NO
|   |   |   |   mortgage = NO
|   |   |   |   |   income <= 21506.2
|   |   |   |   |   |   age <= 41: NO (11.0/1.0)
|   |   |   |   |   |   age > 41: YES (5.0/1.0)
|   |   |   |   |   income > 21506.2: NO (20.0)
|   |   |   |   mortgage = YES: YES (25.0/3.0)
|   |   |   save_act = YES: NO (119.0/12.0)
|   children > 0
|   |   income <= 15538.8
|   |   |   age <= 41: NO (22.0/2.0)
|   |   |   age > 41: YES (2.0)
|   |   income > 15538.8: YES (111.0/5.0)
children > 1
|   income <= 30404.3: NO (124.0/12.0)
|   income > 30404.3
|   |   children <= 2: YES (51.0/5.0)
|   |   children > 2
|   |   |   income <= 44288.3: NO (19.0/2.0)
|   |   |   income > 44288.3: YES (8.0)

Number of Leaves  : 	15

Size of the tree : 	29


Time taken to build model: 0.01 seconds

=== Evaluation on training set ===
=== Summary ===

Correctly Classified Instances         554               92.3333 %
Incorrectly Classified Instances        46                7.6667 %
Kappa statistic                          0.845 
K&B Relative Info Score              45010.1705 %
K&B Information Score                  447.6762 bits      0.7461 bits/instance
Class complexity | order 0             596.7451 bits      0.9946 bits/instance
Class complexity | scheme              222.7757 bits      0.3713 bits/instance
Complexity improvement     (Sf)        373.9693 bits      0.6233 bits/instance
Mean absolute error                      0.1389
Root mean squared error                  0.2636
Relative absolute error                 27.9979 %
Root relative squared error             52.9137 %
Total Number of Instances              600     

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.894     0.052      0.935     0.894     0.914      0.936    YES
                 0.948     0.106      0.914     0.948     0.931      0.936    NO
Weighted Avg.    0.923     0.081      0.924     0.923     0.923      0.936

=== Confusion Matrix ===

   a   b   <-- classified as
 245  29 |   a = YES
  17 309 |   b = NO
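
The summary figures in this log can be recomputed from the confusion matrix alone, which also shows what the value printed by the Java program corresponds to: the fraction of correctly classified instances (accuracy), not per-class precision. The following is a minimal sketch in plain Java. The class and method names are hypothetical, and it assumes the usual Weka layout in which rows are actual classes and columns are predicted classes.

```java
// Hypothetical helper: recomputes accuracy, precision and recall from a confusion matrix.
final class ConfusionMatrixMetrics {

    /** Fraction of all instances that lie on the diagonal (correctly classified). */
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int actual = 0; actual < m.length; actual++) {
            for (int predicted = 0; predicted < m[actual].length; predicted++) {
                total += m[actual][predicted];
                if (actual == predicted) {
                    correct += m[actual][predicted];
                }
            }
        }
        return (double) correct / total;
    }

    /** Of the instances predicted as class c, the fraction that really are class c. */
    static double precision(int[][] m, int c) {
        int predictedAsC = 0;
        for (int[] row : m) {
            predictedAsC += row[c];
        }
        return (double) m[c][c] / predictedAsC;
    }

    /** Of the instances that really are class c, the fraction predicted as class c. */
    static double recall(int[][] m, int c) {
        int actuallyC = 0;
        for (int count : m[c]) {
            actuallyC += count;
        }
        return (double) m[c][c] / actuallyC;
    }

    public static void main(String[] args) {
        // Rows: actual YES, actual NO; columns: classified as YES, classified as NO
        int[][] matrix = {{245, 29}, {17, 309}};
        System.out.println("accuracy       = " + accuracy(matrix));
        System.out.println("precision(YES) = " + precision(matrix, 0));
        System.out.println("recall(YES)    = " + recall(matrix, 0));
        System.out.println("precision(NO)  = " + precision(matrix, 1));
        System.out.println("recall(NO)     = " + recall(matrix, 1));
    }
}
```

With the matrix above, the accuracy is (245 + 309) / 600, matching the 554 correctly classified instances in the summary, and the per-class values round to the figures in the "Detailed Accuracy By Class" table.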

 

quote:http://blog.csdn.net/felomeng/article/details/4688257#comments

     
