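For my graduation project I am using libsvm as the baseline for a transfer learning algorithm. The data all comes in .mat format, so MATLAB is the natural choice, and I have spent the last two days getting familiar with this part.
Installation:
Fork a copy of the code from the official site, then run make.m in the matlab folder to build 4 MEX files. I renamed the train and predict files to libtrain and libpredict so that they do not clash with MATLAB's built-in SVM functions. (The README excerpts below keep the original names svmtrain and svmpredict.)
Once the MEX files are built, add the matlab directory to the path: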
svmpath='.\libsvm-3.22\matlab';
addpath(svmpath);
After that, libsvm can be used through ordinary function calls.
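To confirm that MATLAB really resolves these names to the freshly built MEX files and not to a toolbox function with the same name, a quick check like the following may help. It is only a sketch: the names listed are the ones the README uses, so substitute libtrain and libpredict if you renamed the files as described above.

```matlab
% Sketch: check which files MATLAB will call for the LIBSVM function names.
svmpath = '.\libsvm-3.22\matlab';   % path used above; adjust to your own checkout
if exist(svmpath, 'dir')
    addpath(svmpath);
end

names  = {'svmtrain', 'svmpredict', 'libsvmread', 'libsvmwrite'};
status = zeros(1, numel(names));
for i = 1:numel(names)
    status(i) = exist(names{i}, 'file');   % 3 means a MEX file was found
    fprintf('%-12s exist() = %d, resolves to: %s\n', ...
            names{i}, status(i), which(names{i}));
end
```

A value of 3 for every name suggests the MEX files are the ones being picked up; a value of 2 points to an .m file somewhere else on the path.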
Usage notes:
Below are the key points from the usage documentation on GitHub, collected here for quick reference:
-----------------------------------------
--- MATLAB/OCTAVE interface of LIBSVM ---
-----------------------------------------

Usage
=====

Note that when calling from MATLAB, all data must be of type double.
Training a model:

matlab> model = svmtrain(training_label_vector, training_instance_matrix [, 'libsvm_options']);

        -training_label_vector:
            An m by 1 vector of training labels (type must be double).
        -training_instance_matrix:
            An m by n matrix of m training instances with n features.
            It can be dense or sparse (type must be double).
        -libsvm_options:
            A string of training options in the same format as that of LIBSVM.
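Data loaded from a .mat file is not always stored as double, so it can be worth converting explicitly before training. The sketch below uses made-up variables (X_raw, y_raw) in place of whatever your .mat file contains, and only calls svmtrain when the LIBSVM MEX file is actually on the path.

```matlab
% Hypothetical data standing in for the contents of a .mat file.
X_raw = single([0.1 0.2; 0.3 0.4; 0.5 0.6; 0.7 0.8]);  % features stored as single
y_raw = int8([1 -1 1 -1]);                             % labels stored as an int8 row

X = double(X_raw);      % the MATLAB interface expects double
y = double(y_raw(:));   % convert and force an m-by-1 column vector

assert(isa(X, 'double') && isa(y, 'double'));
assert(size(X, 1) == size(y, 1), 'need one label per training instance');

if exist('svmtrain', 'file') == 3   % run only if the LIBSVM MEX file is found
    model = svmtrain(y, X, '-s 0 -t 2 -c 1 -q');
end
```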
options:
-s svm_type : set type of SVM (default 0)
        0 -- C-SVC              (multi-class classification)
        1 -- nu-SVC             (multi-class classification)
        2 -- one-class SVM
        3 -- epsilon-SVR        (regression)
        4 -- nu-SVR             (regression)
-t kernel_type : set type of kernel function (default 2)
        0 -- linear: u'*v
        1 -- polynomial: (gamma*u'*v + coef0)^degree
        2 -- radial basis function: exp(-gamma*|u-v|^2)
        3 -- sigmoid: tanh(gamma*u'*v + coef0)
        4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)
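Since the options are passed as a single string, sprintf is a convenient way to build them when sweeping parameters. The sketch below builds option strings for a small grid over -c and -g with 5-fold cross validation. The grid values are illustrative assumptions, not recommendations, and the training loop only runs if LIBSVM is on the path and X and y already exist in the workspace.

```matlab
Cs     = 2.^(-1:2:3);    % candidate C values (illustrative only)
gammas = 2.^(-5:2:-1);   % candidate gamma values (illustrative only)

opts = cell(numel(Cs), numel(gammas));
for i = 1:numel(Cs)
    for j = 1:numel(gammas)
        % C-SVC, RBF kernel, 5-fold cross validation, quiet mode
        opts{i, j} = sprintf('-s 0 -t 2 -c %g -g %g -v 5 -q', Cs(i), gammas(j));
    end
end
disp(opts{1, 1});

if exist('svmtrain', 'file') == 3 && exist('X', 'var') && exist('y', 'var')
    % With -v, svmtrain returns a scalar (cross-validation accuracy for classification).
    cv_acc = cellfun(@(o) svmtrain(y, X, o), opts);
    [best_acc, best_idx] = max(cv_acc(:));
    fprintf('best options: %s (CV accuracy %.2f)\n', opts{best_idx}, best_acc);
end
```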
Predicting with the model:
matlab> [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);
When only the predicted labels are needed:

matlab> [predicted_label] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);

If the testing labels are unknown, any random values will do.

        -testing_label_vector:
            An m by 1 vector of prediction labels. If labels of test
            data are unknown, simply use any random values. (type must be double)
        -testing_instance_matrix:
            An m by n matrix of m testing instances with n features.
            It can be dense or sparse. (type must be double)
        -model:
            The output of svmtrain.
        -libsvm_options:
            A string of testing options in the same format as that of LIBSVM.
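For unlabeled test data, a placeholder label vector of the right length is enough. A minimal sketch, assuming a trained model is already in the workspace and that the made-up X_test has the same number of features as the training data:

```matlab
X_test = [0.2 0.1; 0.9 0.8; 0.4 0.5];        % hypothetical unlabeled instances
dummy_labels = zeros(size(X_test, 1), 1);    % placeholder labels, one per row

if exist('svmpredict', 'file') == 3 && exist('model', 'var')
    predicted_label = svmpredict(dummy_labels, X_test, model, '-q');
    % Any accuracy reported for this call is meaningless, because the
    % labels it is compared against are placeholders.
end
```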
Structure of the returned model:

Returned Model Structure
========================

The 'svmtrain' function returns a model which can be used for future
prediction. It is a structure and is organized as [Parameters, nr_class,
totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:

        -Parameters: parameters
        -nr_class: number of classes; = 2 for regression/one-class svm
        -totalSV: total #SV
        -rho: -b of the decision function(s) wx+b
        -Label: label of each class; empty for regression/one-class SVM
        -sv_indices: values in [1,...,num_training_data] to indicate SVs in the training set
        -ProbA: pairwise probability information; empty if -b 0 or in one-class SVM
        -ProbB: pairwise probability information; empty if -b 0 or in one-class SVM
        -nSV: number of SVs for each class; empty for regression/one-class SVM
        -sv_coef: coefficients for SVs in decision functions
        -SVs: support vectors

If you do not use the option '-b 1', ProbA and ProbB are empty
matrices. If the '-v' option is specified, cross validation is
conducted and the returned model is just a scalar: cross-validation
accuracy for classification and mean-squared error for regression.

More details about this model can be found in LIBSVM FAQ
(http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html) and LIBSVM
implementation document
(http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf).
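For a two-class model trained with the linear kernel (-t 0), the fields above are enough to recover the weight vector w and bias b of the decision function wx+b, following the recipe in the LIBSVM FAQ. The struct below is filled in by hand with made-up numbers purely to show the arithmetic; it is not the output of a real training run.

```matlab
% Hand-made stand-in for a two-class linear model (values are hypothetical).
model.SVs     = sparse([1 2; 3 1]);   % one support vector per row
model.sv_coef = [0.5; -0.5];          % coefficients of the support vectors
model.rho     = 0.25;                 % rho is -b
model.Label   = [1; -1];              % class order used inside the model

w = full(model.SVs' * model.sv_coef); % weight vector, n_features x 1
b = -model.rho;

% If the first label in the model is -1, flip the signs so that a positive
% decision value corresponds to the class labelled +1.
if model.Label(1) == -1
    w = -w;
    b = -b;
end

x = [2 4];                % a hypothetical instance (row vector)
dec_value = x * w + b;    % same form as the decision function wx+b
```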
Prediction results:

Result of Prediction
====================

The function 'svmpredict' has three outputs.

The first one, predicted_label, is a vector of predicted labels.
(In short: the first output is the predicted labels.)

The second output, accuracy, is a vector including accuracy (for
classification), mean squared error, and squared correlation
coefficient (for regression).
(In short: the second output is [classification accuracy, mean squared error, squared correlation coefficient], the last two being for regression.)

The third is a matrix containing decision values or probability
estimates (if '-b 1' is specified). If k is the number of classes in
training data, for decision values, each row includes results of
predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance,
instead of an empty vector. For probabilities, each row contains k
values indicating the probability that the testing instance is in each
class. Note that the order of classes here is the same as 'Label' field
in the model structure.
(In short: the third output is a matrix of decision values or probability estimates, which are intermediate quantities of the SVM computation.)
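The point about class order is easy to miss: the columns of the probability matrix follow model.Label, not the sorted label values. The sketch below uses a made-up Label vector and probability matrix to show how a column index maps back to a class label, and how many decision-value columns to expect for k classes.

```matlab
k = 4;
n_pairs = k * (k - 1) / 2;   % columns of decision values for k classes

% Hypothetical three-class case: Label order as it might appear in a model,
% and a made-up probability matrix with one row per test instance.
Label = [3; 1; 2];
prob_estimates = [0.2 0.7 0.1;
                  0.6 0.3 0.1];

[~, col] = max(prob_estimates, [], 2);   % most probable column in each row
predicted = Label(col);                  % map the column index to a class label
```

Here the first instance is most probable in column 2, which corresponds to class 1 rather than class 2.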
Other Utilities
===============

A matlab function libsvmread reads files in LIBSVM format
(that is, files in LIBSVM format can be loaded directly with the function libsvm provides):

[label_vector, instance_matrix] = libsvmread('data.txt');

Two outputs are labels and instances, which can then be used as inputs
of svmtrain or svmpredict.

A matlab function libsvmwrite writes Matlab matrix to a file in LIBSVM format:

libsvmwrite('data.txt', label_vector, instance_matrix)

The instance_matrix must be a sparse matrix. (type must be double)
For 32bit and 64bit MATLAB on Windows, pre-built binary files are ready
in the directory `..\windows', but in future releases, we will only
include 64bit MATLAB binary files.

These codes are prepared by Rong-En Fan and Kai-Wei Chang from National
Taiwan University.
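Because libsvmwrite only accepts a sparse double matrix, dense data needs a sparse() conversion first. A small round-trip sketch with made-up data follows; it writes to a temporary file and only runs the file I/O when the MEX files are available. In LIBSVM format each line holds a label followed by index:value pairs for the non-zero features.

```matlab
X = [0.5 0 1.2;
     0   0 3.0];            % hypothetical dense feature matrix
y = [1; -1];

X_sparse = sparse(X);       % libsvmwrite requires a sparse double matrix

if exist('libsvmwrite', 'file') == 3 && exist('libsvmread', 'file') == 3
    fname = [tempname '.txt'];             % temporary file for the round trip
    libsvmwrite(fname, y, X_sparse);
    [y_back, X_back] = libsvmread(fname);  % read the file back in
    delete(fname);
end
```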
Examples
========

Train and test on the provided data heart_scale:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07');
matlab> [predict_label, accuracy, dec_values] = svmpredict(heart_scale_label, heart_scale_inst, model); % test the training data

For probability estimates, you need '-b 1' for training and testing:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07 -b 1');
matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> [predict_label, accuracy, prob_estimates] = svmpredict(heart_scale_label, heart_scale_inst, model, '-b 1');

To use precomputed kernel, you must include sample serial number as
the first column of the training and testing data (assume your kernel
matrix is K, # of instances is n):

matlab> K1 = [(1:n)', K]; % include sample serial number as first column
matlab> model = svmtrain(label_vector, K1, '-t 4');
matlab> [predict_label, accuracy, dec_values] = svmpredict(label_vector, K1, model); % test the training data

We give the following detailed example by splitting heart_scale into
150 training and 120 testing data. Constructing a linear kernel
matrix and then using the precomputed kernel gives exactly the same
testing error as using the LIBSVM built-in linear kernel.
matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab>
matlab> % Split Data
matlab> train_data = heart_scale_inst(1:150,:);
matlab> train_label = heart_scale_label(1:150,:);
matlab> test_data = heart_scale_inst(151:270,:);
matlab> test_label = heart_scale_label(151:270,:);
matlab>
matlab> % Linear Kernel
matlab> model_linear = svmtrain(train_label, train_data, '-t 0');
matlab> [predict_label_L, accuracy_L, dec_values_L] = svmpredict(test_label, test_data, model_linear);
matlab>
matlab> % Precomputed Kernel
matlab> model_precomputed = svmtrain(train_label, [(1:150)', train_data*train_data'], '-t 4');
matlab> [predict_label_P, accuracy_P, dec_values_P] = svmpredict(test_label, [(1:120)', test_data*train_data'], model_precomputed);
matlab>
matlab> accuracy_L % Display the accuracy using linear kernel
matlab> accuracy_P % Display the accuracy using precomputed kernel

Note that for testing, you can put anything in the
testing_label_vector. For more details of precomputed kernels, please
read the section ``Precomputed Kernels'' in the README of the LIBSVM
package.
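The same pattern extends to kernels LIBSVM does not compute for you. As an illustration only, the sketch below precomputes an RBF kernel by hand on a made-up toy set. The variable names, the data and the kernel width gam are all assumptions for the example. Note that the test kernel has one row per test instance and one column per training instance, matching test_data*train_data' in the README example.

```matlab
% Hypothetical toy data; in practice these come from your own train/test split.
train_data  = [0 0; 0 1; 1 0; 1 1];
train_label = [-1; -1; 1; 1];
test_data   = [0.1 0.2; 0.9 0.8];
test_label  = [-1; 1];
gam = 0.5;   % assumed kernel width, chosen only for illustration

% Squared Euclidean distance between every row of A and every row of B,
% clamped at zero to guard against tiny negative rounding errors.
sqdist = @(A, B) max(bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2 * (A * B'), 0);
rbf    = @(A, B) exp(-gam * sqdist(A, B));

K_train = rbf(train_data, train_data);   % n_train x n_train
K_test  = rbf(test_data,  train_data);   % n_test  x n_train

n_train = size(train_data, 1);
n_test  = size(test_data, 1);

if exist('svmtrain', 'file') == 3 && exist('svmpredict', 'file') == 3
    % Serial numbers go in the first column, as required for '-t 4'.
    model_pre = svmtrain(train_label, [(1:n_train)', K_train], '-t 4 -q');
    [pred, acc, dec] = svmpredict(test_label, [(1:n_test)', K_test], model_pre);
end
```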
Additional Information
======================

For any question, please contact Chih-Jen Lin, or check the FAQ page:

http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q10:_MATLAB_interface