Utility Functions
=================
To use utility functions, type:
>>> from svmutil import *
The above command loads:
svm_train() : train an SVM model
svm_predict() : predict testing data
svm_read_problem() : read the data from a LIBSVM-format file.
svm_load_model() : load a LIBSVM model.
svm_save_model() : save model to a file.
evaluations() : evaluate prediction results.
csr_find_scale_param() : find scaling parameter for data in csr format.
csr_scale() : apply data scaling to data in csr format.
## First function
- ***Function: svm_train***
There are three ways to call svm_train():
>>> model = svm_train(y, x [, 'training_options'])
>>> model = svm_train(prob [, 'training_options'])
>>> model = svm_train(prob, param)
y: a list/tuple/ndarray of l training labels (type must be int/double).
x: 1. a list/tuple of l training instances. Feature vector of each training instance is a list/tuple or dictionary.
2. an l * n numpy ndarray or scipy spmatrix (n: number of features).
training_options: a string in the same form as that for LIBSVM command mode.
prob: an svm_problem instance generated by calling
svm_problem(y, x).
For a pre-computed kernel, you should use
svm_problem(y, x, isKernel=True)
param: an svm_parameter instance generated by calling
svm_parameter('training_options')
model: the returned svm_model instance. See svm.h for details of this structure. If '-v' is specified, cross validation is conducted and the returned model is just a scalar: cross-validation accuracy for classification and mean-squared error for regression.
To train the same data many times with different parameters, the second and the third ways should be faster.
Examples:
>>> y, x = svm_read_problem('../heart_scale')
>>> prob = svm_problem(y, x)
>>> param = svm_parameter('-s 3 -c 5 -h 0')
>>> m = svm_train(y, x, '-c 5')
>>> m = svm_train(prob, '-t 2 -c 5')
>>> m = svm_train(prob, param)
>>> CV_ACC = svm_train(y, x, '-v 3')
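The list/tuple form and the dictionary form of a feature vector describe the same instance. The following is a minimal sketch of that correspondence, assuming dictionary keys are 1-based feature indices and that zero-valued features may be omitted. `dense_to_sparse` is a hypothetical helper written for illustration; it is not part of svmutil.

```python
def dense_to_sparse(features):
    """Convert a dense feature list into a {index: value} dictionary.

    Assumption: feature indices start at 1 and zero entries are dropped.
    """
    return {i: v for i, v in enumerate(features, start=1) if v != 0}


# Two training instances in dense (list) form.
x_dense = [[1, 0, 1], [-1, 0, -1]]

# The same instances in sparse (dictionary) form.
x_sparse = [dense_to_sparse(row) for row in x_dense]
print(x_sparse)
```

Either `x_dense` or `x_sparse` could then be passed as x together with a matching label list y.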
## Second function
- ***Function: svm_predict***
To predict testing data with a model, use
>>> p_labels, p_acc, p_vals = svm_predict(y, x, model [,'predicting_options'])
y: a list/tuple/ndarray of l true labels (type must be int/double).
It is used for calculating the accuracy. Use [] if true labels are unavailable.
x: 1. a list/tuple of l testing instances. Feature vector of each testing instance is a list/tuple or dictionary.
2. an l * n numpy ndarray or scipy spmatrix (n: number of features).
predicting_options: a string of predicting options in the same format as that of LIBSVM.
model: an svm_model instance.
p_labels: a list of predicted labels.
p_acc: a tuple including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).
p_vals: a list of decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in training data, for decision values, each element includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case. Decision value [+1] is returned for each testing instance, instead of an empty list.
For probabilities, each element contains k values indicating the probability that the testing instance is in each class. Note that the order of classes is the same as the 'model.label' field in the model structure.
Example:
>>> m = svm_train(y, x, '-c 5')
>>> p_labels, p_acc, p_vals = svm_predict(y, x, m)
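To make the k(k-1)/2 count concrete, the sketch below enumerates the one-vs-one label pairs for a given label list. `pairwise_classifiers` is a hypothetical helper, and the pair ordering it produces (first label against each later label, then the second label against each later one) is an assumption for illustration; check it against your LIBSVM version before mapping entries of p_vals to specific class pairs.

```python
from itertools import combinations


def pairwise_classifiers(labels):
    """List the label pairs of the k(k-1)/2 one-vs-one binary SVMs."""
    return list(combinations(labels, 2))


# Stand-in for the label order stored in the model (hypothetical values).
labels = [1, 2, 3, 4]
k = len(labels)

pairs = pairwise_classifiers(labels)
print(len(pairs), "binary classifiers for k =", k)
for a, b in pairs:
    print(f"{a} vs {b}")
```

With k = 4 this yields 6 pairs, so each element of p_vals would hold 6 decision values.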
## Third group of functions
- ***Functions: svm_read_problem/svm_load_model/svm_save_model***
See the usage by examples:
>>> y, x = svm_read_problem('data.txt')
>>> m = svm_load_model('model_file')
>>> svm_save_model('model_file', m)
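For readers unfamiliar with the LIBSVM data format, each line has the form `<label> <index>:<value> <index>:<value> ...`. The sketch below parses such lines into a label list and a list of feature dictionaries. `parse_libsvm_lines` is a hypothetical stand-in written for illustration and is not the library's implementation of svm_read_problem().

```python
def parse_libsvm_lines(lines):
    """Parse lines of the form '<label> <index>:<value> ...'.

    Returns (y, x): y is a list of float labels and x is a list of
    {index: value} dictionaries, one per instance.
    """
    y, x = [], []
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        y.append(float(parts[0]))
        features = {}
        for item in parts[1:]:
            index, value = item.split(":")
            features[int(index)] = float(value)
        x.append(features)
    return y, x


# Two instances: the first has features 1 and 3, the second only feature 2.
sample = ["+1 1:0.5 3:-1", "-1 2:1"]
y, x = parse_libsvm_lines(sample)
print(y)
print(x)
```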
## Fourth function
- ***Function: evaluations***
Calculate some evaluations using the true values (ty) and the predicted values (pv):
>>> (ACC, MSE, SCC) = evaluations(ty, pv, useScipy)
ty: a list/tuple/ndarray of true values.
pv: a list/tuple/ndarray of predicted values.
useScipy: if True, convert ty and pv to ndarray and use scipy functions to do the evaluation.
ACC: accuracy.
MSE: mean squared error.
SCC: squared correlation coefficient.
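The three measures can be illustrated without scipy. The sketch below is a pure-Python approximation, assuming that accuracy is reported as a percentage and that SCC is the squared Pearson correlation between ty and pv. `evaluate` is a hypothetical name; it is not the library's evaluations() function.

```python
def evaluate(ty, pv):
    """Return (ACC, MSE, SCC) for true values ty and predicted values pv."""
    if len(ty) != len(pv):
        raise ValueError("len(ty) must equal len(pv)")
    l = len(ty)
    correct = sum(1 for y, v in zip(ty, pv) if y == v)
    sq_err = sum((v - y) ** 2 for y, v in zip(ty, pv))
    sum_y, sum_v = sum(ty), sum(pv)
    sum_yy = sum(y * y for y in ty)
    sum_vv = sum(v * v for v in pv)
    sum_vy = sum(v * y for y, v in zip(ty, pv))

    acc = 100.0 * correct / l  # assumption: accuracy as a percentage
    mse = sq_err / l
    denom = (l * sum_vv - sum_v ** 2) * (l * sum_yy - sum_y ** 2)
    # SCC is undefined when either sequence is constant.
    scc = (l * sum_vy - sum_v * sum_y) ** 2 / denom if denom else float("nan")
    return acc, mse, scc


acc, mse, scc = evaluate([1, -1, 1, -1], [1, -1, -1, -1])
print(acc, mse, scc)
```

ACC is meaningful for classification, while MSE and SCC are meaningful for regression, which is why all three are returned together.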
## Fifth group of functions
- ***Functions: csr_find_scale_param/csr_scale***
Scale data in csr format.
>>> param = csr_find_scale_param(x [, lower=l, upper=u])
>>> x = csr_scale(x, param)
x: a csr_matrix of data.
l: x scaling lower limit; default -1.
u: x scaling upper limit; default 1.
The scaling process is: x * diag(coef) + ones(l, 1) * offset' (in this formula, l is the number of rows of x, not the lower limit).
param: a dictionary of scaling parameters, where param['coef'] = coef and param['offset'] = offset.
coef: a scipy array of scaling coefficients.
offset: a scipy array of scaling offsets.
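The scaling formula can be worked through on a small dense example. The sketch below uses plain Python lists rather than a csr_matrix, and assumes a per-feature min-max mapping of [min, max] onto [lower, upper]. `find_scale_param` and `scale` are hypothetical names chosen for illustration; they are not the csr_find_scale_param() and csr_scale() implementations.

```python
def find_scale_param(rows, lower=-1.0, upper=1.0):
    """Compute per-feature coef and offset mapping [min, max] to [lower, upper]."""
    n = len(rows[0])
    coef, offset = [], []
    for j in range(n):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        if hi == lo:  # constant feature: zero coef and offset, so it maps to 0
            coef.append(0.0)
            offset.append(0.0)
        else:
            c = (upper - lower) / (hi - lo)
            coef.append(c)
            offset.append(lower - c * lo)
    return {"coef": coef, "offset": offset}


def scale(rows, param):
    """Apply x * diag(coef) + ones(l, 1) * offset' one row at a time."""
    return [
        [v * c + o for v, c, o in zip(row, param["coef"], param["offset"])]
        for row in rows
    ]


x = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
param = find_scale_param(x)
scaled = scale(x, param)
print(param)
print(scaled)
```

Keeping the parameters in a dictionary mirrors the param['coef'] and param['offset'] layout described above, so parameters found on training data can be reused to scale testing data.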
Additional Information
======================
This interface was written by Hsiang-Fu Yu from the Department of Computer Science, National Taiwan University. If you find this tool useful, please cite LIBSVM as follows:
Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
For any questions, please contact Chih-Jen Lin.