scikit-learn (introducing the models used most often in engineering practice): 1.12. Multiclass and multilabel algorithms


Today we look at the Multiclass and multilabel algorithms that are used relatively often in engineering practice.

Warning: all classifiers in scikit-learn do multiclass classification out-of-the-box, so there is no need to use the sklearn.multiclass module covered in this section. It is presented here only as background knowledge.
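As a minimal sketch of what "out-of-the-box" means here, a plain classifier can be fitted directly on a three-class target without any wrapper. The choice of the iris dataset and of DecisionTreeClassifier is an assumption made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Iris has 150 samples and 3 classes (illustrative dataset choice).
X, y = load_iris(return_X_y=True)

# No sklearn.multiclass wrapper is involved: the classifier handles
# the three classes by itself.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

print(clf.classes_)               # class labels discovered in y
predictions = clf.predict(X[:5])  # one predicted class per sample
print(predictions)
```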

Below is a summary of the classifiers supported by scikit-learn grouped by strategy; you don’t need the meta-estimators in this module if you’re using one of these unless you want custom multiclass behavior:

  • Inherently multiclass: Naive Bayes, sklearn.lda.LDA, Decision Trees, Random Forests, Nearest Neighbors, setting “multi_class=multinomial” in sklearn.linear_model.LogisticRegression.
  • One-Vs-One: sklearn.svm.SVC.
  • One-Vs-All: all linear models except sklearn.svm.SVC.
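For cases where custom multiclass behavior is wanted, the two strategies named above can be sketched with the meta-estimators in sklearn.multiclass. This is only an illustration: the synthetic four-class data and the LogisticRegression base estimator are assumptions, not something prescribed by the text.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic data with 4 classes (an assumption made for this sketch).
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

base = LogisticRegression(max_iter=1000)

# One-Vs-All (one-vs-rest): one binary classifier per class.
ovr = OneVsRestClassifier(base).fit(X, y)

# One-Vs-One: one binary classifier per pair of classes.
ovo = OneVsOneClassifier(base).fit(X, y)

n_ovr = len(ovr.estimators_)  # k classifiers for k classes
n_ovo = len(ovo.estimators_)  # k * (k - 1) / 2 classifiers
print(n_ovr, n_ovo)
```

With k classes, one-vs-rest trains k binary problems while one-vs-one trains k(k-1)/2, so the difference in cost grows with the number of classes.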

Some estimators also support multioutput-multiclass classification tasks: Decision Trees, Random Forests, Nearest Neighbors.


Multiclass classification means a classification task with more than two classes; each sample, however, can belong to only one of those classes (this amounts to a single multi-way classification).

Multilabel classification assigns to each sample a set of target labels. A sample can belong to several classes at once (this amounts to several binary classifications).
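The "several binary classifications" view can be sketched as follows: the label sets are turned into a binary indicator matrix, and one binary classifier is trained per column. The toy data, the tag names "a" and "b", and the choice of LogisticRegression are hypothetical and serve only as an illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: each sample carries a set of tags.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [0.0, 1.0], [0.1, 0.9]]
tags = [{"a"}, {"a"}, {"b"}, {"b"}, {"a", "b"}, {"a", "b"}]

# Convert the label sets to an indicator matrix with one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# One binary classifier is fitted per column of Y.
clf = OneVsRestClassifier(LogisticRegression()).fit(X, Y)
pred = clf.predict(X)  # indicator matrix with the same layout as Y

# Map predicted indicator rows back to tuples of tag names.
print(mlb.inverse_transform(pred))
```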

Multioutput-multiclass classification and multi-task classification mean that a single estimator has to handle several joint classification tasks (this amounts to several multi-way classifications). The set of labels can be different for each output variable. For instance, a sample could be assigned “pear” for an output variable that takes possible values in a finite set of species such as “pear”, “apple”, “orange”, and “green” for a second output variable that takes possible values in a finite set of colors such as “green”, “red”, “orange”, “yellow”...
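The fruit example can be sketched with a decision tree, one of the estimators listed earlier as supporting multioutput-multiclass tasks, by passing a target with one column per output variable. The feature values below (a weight and a hue score) are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features: weight in grams and a hue score.
X = np.array([
    [150, 0.30], [160, 0.35],
    [170, 0.10], [180, 0.05],
    [140, 0.60], [130, 0.65],
])

# Two output columns with different label sets: species and color.
Y = np.array([
    ["pear", "green"],
    ["pear", "green"],
    ["apple", "red"],
    ["apple", "red"],
    ["orange", "orange"],
    ["orange", "orange"],
])

clf = DecisionTreeClassifier(random_state=0).fit(X, Y)

# classes_ holds one array of labels per output variable.
print(clf.classes_)

pred = clf.predict(X[:1])  # shape (1, 2): one species and one color
print(pred)
```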
