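When training a model, especially when running cross-validation on the training set, you usually want to save the trained model so that it can later be evaluated on an independent test set. This post shows how to save and reuse a trained model in Python.
scikit-learn models can be persisted with joblib; importing it is all that is needed.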
# sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23, so import joblib directly
import joblib
x = [[2,3,1],[4,6,3]]
y = [1,0]
from sklearn.svm import SVC
clf = SVC()
clf.fit(x,y)
Out[252]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
clf.coef0
Out[253]: 0.0
joblib.dump(clf,'c:/users/yingfei-wjc/desktop/train_model.m')
Out[255]: ['c:/users/yingfei-wjc/desktop/train_model.m']
joblib's dump function saves the model to local disk; clf is the trained classifier.
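The save step can also be written as a standalone script. The following is a minimal sketch, not the original author's code: the temporary directory, the file name train_model.joblib, and the compress=3 setting are assumptions chosen for illustration, and it assumes joblib is installed as a standalone package.

```python
import os
import tempfile

import joblib
from sklearn.svm import SVC

# Same toy data as in the session above.
x = [[2, 3, 1], [4, 6, 3]]
y = [1, 0]
clf = SVC().fit(x, y)

# Hypothetical location: a temporary directory instead of a hard-coded desktop path.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "train_model.joblib")

# compress=3 (an assumed setting) trades some speed for a smaller file.
# dump returns the list of file names it wrote.
saved_files = joblib.dump(clf, model_path, compress=3)
file_exists = os.path.exists(model_path)
print(saved_files, file_exists)
```

Building the path with os.path.join keeps the script portable across operating systems, which a fixed Windows path does not.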
model = joblib.load("c:/users/yingfei-wjc/desktop/train_model.m")
model
Out[257]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
joblib's load function loads the saved model, which can then be used for prediction on the test set.
model.predict([[5,5,3],[2,2,1]])
Out[258]: array([0, 1])
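To tie the steps back to the motivating scenario (train, save, then evaluate on a held-out test set), here is a hedged end-to-end sketch. The synthetic dataset from make_classification, the 75/25 split, and the file name are assumptions for illustration only; a real workflow would use its own data and path.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train on the training split only.
clf = SVC().fit(X_train, y_train)

# Save the fitted classifier to a hypothetical temporary path.
model_path = os.path.join(tempfile.mkdtemp(), "train_model.joblib")
joblib.dump(clf, model_path)

# Load it back, as a separate evaluation script would.
model = joblib.load(model_path)

# The reloaded model should predict exactly like the original one.
same_predictions = np.array_equal(clf.predict(X_test), model.predict(X_test))
test_accuracy = model.score(X_test, y_test)
print(same_predictions, test_accuracy)
```

Note that a joblib file is a pickle under the hood, so it should only be loaded from a trusted source and, ideally, with the same scikit-learn version that wrote it.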