http://blog.csdn.net/pipisorry/article/details/52268947
Grid Search: Searching for estimator parameters
scikit-learn provides pipeline (for estimator connection) and grid_search (for searching for the best parameters) for parallel hyper-parameter tuning.
For example, when doing text classification with scikit-learn: how many words should the vectorizer keep? During preprocessing, words with tf > max_df are filtered out, so what should max_df be set to? Should the TfidfTransformer use tf only or also idf? How many iterations should the classifier run, and how should the learning rate be set? "Looping over the candidates and trying them one by one" is essentially what grid search does.
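Below is a minimal sketch of such a setup: a Pipeline chaining CountVectorizer, TfidfTransformer and SGDClassifier, tuned with GridSearchCV. The toy corpus and the candidate parameter values are assumptions for illustration only.

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# toy data just to make the sketch runnable; replace with a real corpus
texts = ["good movie", "bad movie", "great film", "terrible film"] * 10
labels = [1, 0, 1, 0] * 10

pipe = Pipeline([
    ("vect", CountVectorizer()),    # how many words? what max_df?
    ("tfidf", TfidfTransformer()),  # tf only, or tf-idf?
    ("clf", SGDClassifier()),       # how many iterations, how much regularization?
])

# parameter names follow the "<step>__<parameter>" convention; the values are illustrative
param_grid = {
    "vect__max_df": [0.5, 0.75, 1.0],
    "vect__max_features": [None, 1000],
    "tfidf__use_idf": [True, False],
    "clf__alpha": [1e-3, 1e-4],
    "clf__max_iter": [5, 50],
}

search = GridSearchCV(pipe, param_grid, cv=3, n_jobs=-1)
search.fit(texts, labels)
print(search.best_params_, search.best_score_)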
某小皮
Hyper-parameters are parameters that are not directly learnt within estimators. In scikit-learn they are passed as arguments to the constructor of the estimator classes.
It is possible and recommended to search the hyper-parameter space for the best cross-validation score (see Cross-validation: evaluating estimator performance).
Any parameter provided when constructing an estimator may be optimized in this manner. Specifically, to find the names and current values for all parameters for a given estimator, use:
estimator.get_params()
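For example (a minimal sketch on an SVC, with the printed values abbreviated):

from sklearn.svm import SVC

estimator = SVC()
print(estimator.get_params())
# e.g. {'C': 1.0, 'gamma': 'scale', 'kernel': 'rbf', ...}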
A search consists of: an estimator (regressor or classifier such as sklearn.svm.SVC()); a parameter space; a method for searching or sampling candidates; a cross-validation scheme; and a score function. Two generic approaches are provided: GridSearchCV exhaustively considers all parameter combinations, while RandomizedSearchCV can sample a given number of candidates from a parameter space with a specified distribution.
Grid Search: concretely, pick several candidate values for each parameter and then, like walking a grid, enumerate every combination of them. The advantage is that it is simple and brute-force, and if the whole grid can be enumerated the result is fairly reliable. The disadvantage is that it is very time-consuming; for neural networks in particular, you usually cannot afford to try many parameter combinations.
param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]
Best example: see Nested versus non-nested cross-validation for an example of Grid Search within a cross-validation loop on the iris dataset.
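A minimal sketch of exhaustive search over this param_grid (the iris data is assumed only for illustration):

from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_iris(return_X_y=True)

param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

# every combination in the grid is evaluated with 5-fold cross-validation
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)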
Random Search: first obtain all candidate parameter values as in Grid Search, then on each trial randomly sample a combination from them for training.
sklearn.model_selection.RandomizedSearchCV(estimator, param_distributions, n_iter=10, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', random_state=None, error_score='raise', return_train_score=True)
Randomized search has two main benefits over an exhaustive search: a budget can be chosen independent of the number of parameters and possible values, and adding parameters that do not influence the performance does not decrease efficiency.
{'C': scipy.stats.expon(scale=100), 'gamma': scipy.stats.expon(scale=.1), 'kernel': ['rbf'], 'class_weight': ['balanced', None]}
In principle, any function can be passed that provides an rvs (random variate sample) method to sample a value.
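A minimal sketch of randomized search with these distributions (the iris data and n_iter value are assumptions for illustration):

import scipy.stats
from sklearn import datasets, svm
from sklearn.model_selection import RandomizedSearchCV

X, y = datasets.load_iris(return_X_y=True)

param_distributions = {
    'C': scipy.stats.expon(scale=100),    # continuous distribution, sampled via rvs()
    'gamma': scipy.stats.expon(scale=.1),
    'kernel': ['rbf'],                    # lists are sampled uniformly
    'class_weight': ['balanced', None],
}

# n_iter fixes the budget independently of the size of the parameter space
search = RandomizedSearchCV(svm.SVC(), param_distributions,
                            n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_)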
Example: Comparing randomized search and grid search for hyperparameter estimation compares the usage and efficiency of randomized search and grid search.
[Evaluation metrics and methods for machine learning models]
An estimator class must provide the methods get_params, set_params(**params), fit(X, y), predict(new_samples) and score(X, y_true). Some of these can be inherited directly from sklearn.base.BaseEstimator.
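A minimal sketch of a custom estimator (the ThresholdClassifier below is a made-up example): get_params/set_params are inherited from BaseEstimator and score from ClassifierMixin, so only fit and predict need to be written for it to work inside GridSearchCV.

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class ThresholdClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, threshold=0.5):
        # every constructor argument automatically becomes a tunable hyper-parameter
        self.threshold = threshold

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        # predict the larger class label when the first feature exceeds the threshold
        return np.where(np.asarray(X)[:, 0] > self.threshold,
                        self.classes_[-1], self.classes_[0])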
Use a validation set (i.e., a development set) for model selection; it is the data fed to the GridSearchCV instance.
Both GridSearchCV and RandomizedSearchCV evaluate parameter settings in parallel when given the keyword n_jobs=-1.
If the model raises an error for some parameter setting, the whole grid search would fail by default; this can be avoided by setting error_score=0 (or np.NaN), which instead issues a warning and sets the score for that fold to 0 (or NaN).
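For example, a minimal sketch of where the option goes (the estimator and grid are placeholders):

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# a candidate whose fit() raises only triggers a warning; its folds are scored as NaN
search = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, error_score=np.nan)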
[Machine learning model selection: hyperparameter tuning and selection]
某小皮
...
Some models can offer an information-theoretic closed-form formula of the optimal estimate of the regularization parameter by computing a single regularization path (instead of several when using cross-validation).
Here is the list of models benefitting from the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) for automated model selection:
linear_model.LassoLarsIC([criterion, ...]) | Lasso model fit with Lars using BIC or AIC for model selection
For the underlying theory, see PRML (Pattern Recognition and Machine Learning).
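A minimal sketch of AIC/BIC-based selection of the Lasso regularization strength (the diabetes dataset is assumed only for illustration):

from sklearn import datasets
from sklearn.linear_model import LassoLarsIC

X, y = datasets.load_diabetes(return_X_y=True)

model_bic = LassoLarsIC(criterion='bic').fit(X, y)
print(model_bic.alpha_)   # regularization strength chosen by BIC

model_aic = LassoLarsIC(criterion='aic').fit(X, y)
print(model_aic.alpha_)   # regularization strength chosen by AIC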
Because ensemble methods fit on resampled (bootstrapped) data, the samples left out of each resample can be used directly for model selection without needing an extra, independent validation set. This left-out portion can be used to estimate the generalization error without having to rely on a separate validation set; the estimate comes "for free" since no additional data is needed, and it can be used for model selection (see the sketch after the list below).
ensemble.RandomForestClassifier([...]) | A random forest classifier.
ensemble.RandomForestRegressor([...]) | A random forest regressor.
ensemble.ExtraTreesClassifier([...]) | An extra-trees classifier.
ensemble.ExtraTreesRegressor([n_estimators, ...]) | An extra-trees regressor.
ensemble.GradientBoostingClassifier([loss, ...]) | Gradient Boosting for classification.
ensemble.GradientBoostingRegressor([loss, ...]) | Gradient Boosting for regression.
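A minimal sketch of using the out-of-bag estimate for model selection (the iris data and the candidate n_estimators values are assumptions for illustration):

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

X, y = datasets.load_iris(return_X_y=True)

for n_estimators in (25, 100, 400):
    forest = RandomForestClassifier(n_estimators=n_estimators,
                                    oob_score=True,   # score on the samples left out of each bootstrap
                                    random_state=0)
    forest.fit(X, y)
    print(n_estimators, forest.oob_score_)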
Bayesian hyper-parameter tuning takes into account the results observed for previously evaluated parameter settings, so it saves time; compared with grid search, the difference is like an old ox versus a sports car. For the underlying theory, see the paper Practical Bayesian Optimization of Machine Learning Algorithms. Two ready-to-use Python libraries that implement Bayesian tuning are also recommended.
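One widely used option for Bayesian hyper-parameter tuning (named here only as an illustration, not necessarily one of the libraries recommended above) is scikit-optimize, whose BayesSearchCV mirrors the GridSearchCV interface. A minimal sketch, with the iris data and search ranges as assumptions:

from sklearn import datasets
from sklearn.svm import SVC
from skopt import BayesSearchCV
from skopt.space import Real, Categorical

X, y = datasets.load_iris(return_X_y=True)

search_spaces = {
    'C': Real(1e-3, 1e3, prior='log-uniform'),
    'gamma': Real(1e-4, 1e1, prior='log-uniform'),
    'kernel': Categorical(['linear', 'rbf']),
}

# each new candidate is chosen using the scores of previous candidates,
# which is what makes this cheaper than exhaustive grid search
opt = BayesSearchCV(SVC(), search_spaces, n_iter=30, cv=3, random_state=0)
opt.fit(X, y)
print(opt.best_params_)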
[Auto-scaling scikit-learn with Apache Spark]
from: http://blog.csdn.net/pipisorry/article/details/52268947
ref: [3.2. Tuning the hyper-parameters of an estimator]*
[python并行调参——scikit-learn grid_search]*
[Parameter estimation using grid search with cross-validation*]
References
Practical recommendations for gradient-based training of deep architectures by Yoshua Bengio (2012)
Efficient BackProp, by Yann LeCun, Léon Bottou, Genevieve Orr and Klaus-Robert Müller
Neural Networks: Tricks of the Trade, edited by Grégoire Montavon, Geneviève Orr, and Klaus-Robert Müller.