Let's start with the description in the help documentation:
predict_proba(X): Probability estimates.

"The returned estimates for all classes are ordered by the label of classes. For a multi_class problem, if multi_class is set to be 'multinomial' the softmax function is used to find the predicted probability of each class. Else use a one-vs-rest approach, i.e. calculate the probability of each class assuming it to be positive using the logistic function, and normalize these values across all the classes."

In other words, the returned estimates for all classes are ordered by class label. For a multiclass problem, if the multi_class parameter is set to 'multinomial', the softmax function is used to compute the predicted probability of each class. Otherwise the one-vs-rest (ovr) approach is used: each class in turn is assumed to be the sole positive class, its probability is computed with the logistic (sigmoid) function, and these values are then normalized across all classes.
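As a toy illustration of the difference between the two schemes (the scores here are made up, not from any fitted model):

```python
import numpy as np

# Made-up decision scores for one sample over 3 classes (illustrative only)
scores = np.array([2.0, 0.5, -1.0])

# 'multinomial': softmax over the raw scores
softmax = np.exp(scores) / np.exp(scores).sum()

# 'ovr': per-class sigmoid (each class vs. rest), then normalize across classes
sigmoid = 1.0 / (1.0 + np.exp(-scores))
ovr = sigmoid / sigmoid.sum()

print(softmax.sum(), ovr.sum())  # both sum to 1, but the probabilities differ
```

Both vectors are valid probability distributions, yet they are generally not equal, which is why the two multi_class settings give different predict_proba outputs.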
Here we use the load_iris dataset as an example; it is a three-class classification dataset.
from sklearn.datasets import load_iris
import numpy as np
X,y = load_iris(return_X_y=True)
Note that LogisticRegression's multi_class parameter has no 'ovo' option.
Let's start with the 'ovr' case and fit the multiclass model:
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression(max_iter=1000,multi_class='ovr').fit(X,y)
clf.predict_proba(X[:5,:])
---
array([[8.96807569e-01, 1.03191359e-01, 1.07219602e-06],
[7.78979389e-01, 2.21019299e-01, 1.31168933e-06],
[8.34864184e-01, 1.65134802e-01, 1.01485082e-06],
[7.90001986e-01, 2.09996107e-01, 1.90723705e-06],
[9.12050403e-01, 8.79485212e-02, 1.07537143e-06]])
Summing each row:
clf.predict_proba(X[:5,:]).sum(1)
---
array([1., 1., 1., 1., 1.])
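The printed 1s hide floating-point rounding. A quick sketch that checks normalization robustly over the whole dataset, with the same estimator settings as above (note that multi_class is deprecated in recent scikit-learn releases, so this assumes a version that still accepts it):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000, multi_class='ovr').fit(X, y)

# every row of predict_proba should sum to 1 up to floating-point error
row_sums = clf.predict_proba(X).sum(axis=1)
print(np.allclose(row_sums, 1.0))
```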
Treat each class in turn as the sole positive class, fit a binary model, and predict:
lst = []
for i in range(3):
    y_tmp = np.int32(y == i)
    clf = LogisticRegression(max_iter=1000,multi_class='ovr').fit(X,y_tmp)
    lst.append(clf.predict_proba(X[:5,:]))
lst
---
[array([[0.01593515, 0.98406485],
[0.02270572, 0.97729428],
[0.0138773 , 0.9861227 ],
[0.02290294, 0.97709706],
[0.01395774, 0.98604226]]),
array([[0.88676836, 0.11323164],
[0.72271295, 0.27728705],
[0.80494651, 0.19505349],
[0.74027081, 0.25972919],
[0.90491648, 0.09508352]]),
array([[9.99998823e-01, 1.17651820e-06],
[9.99998354e-01, 1.64562311e-06],
[9.99998801e-01, 1.19871885e-06],
[9.99997641e-01, 2.35892535e-06],
[9.99998837e-01, 1.16261302e-06]])]
Normalize: for class 0, divide its positive-class probability by the sum of the three positive-class probabilities:
lst[0][:,1] / (lst[0][:,1] + lst[1][:,1] + lst[2][:,1])
---
array([0.89680757, 0.77897939, 0.83486418, 0.79000199, 0.9120504 ])
clf = LogisticRegression(max_iter=1000,multi_class='ovr').fit(X,y)  # refit: the loop above overwrote clf with a binary model
clf.predict_proba(X[:5,:])[:,0]
---
array([0.89680757, 0.77897939, 0.83486418, 0.79000199, 0.9120504 ])
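The comparison above only checks class 0 on five samples. A sketch that verifies the entire predict_proba matrix against the manual one-vs-rest construction (same data and settings; again assuming a scikit-learn version that still accepts multi_class):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000, multi_class='ovr').fit(X, y)

# positive-class probability from each one-vs-rest binary model, one column per class
pos = np.column_stack([
    LogisticRegression(max_iter=1000).fit(X, np.int32(y == i)).predict_proba(X)[:, 1]
    for i in range(3)
])
manual = pos / pos.sum(axis=1, keepdims=True)  # normalize across classes

print(np.allclose(manual, clf.predict_proba(X)))
```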
Now for the 'multinomial' case.
Here I did not split the problem into three small logistic regressions and normalize the resulting probabilities: for one thing, I could not actually reproduce the result that way; for another, the documentation never says to treat each class as positive the way ovr does, and the source code simply calls decision_function directly.
Fit the model:
clf = LogisticRegression(max_iter=1000,multi_class='multinomial').fit(X,y)
clf.predict_proba(X[:5,:])
---
array([[9.81582009e-01, 1.84179761e-02, 1.45003883e-08],
[9.71335505e-01, 2.86644651e-02, 3.01881553e-08],
[9.85273339e-01, 1.47266488e-02, 1.23397518e-08],
[9.76063609e-01, 2.39363516e-02, 3.97048513e-08],
[9.85232612e-01, 1.47673762e-02, 1.20065646e-08]])
As the source code does, call decision_function:
clf.decision_function(X[:5,:])
---
array([[ 7.33544637, 3.35960795, -10.69505432],
[ 6.93658219, 3.41356847, -10.35015066],
[ 7.46628805, 3.26302765, -10.72931571],
[ 6.9085648 , 3.20043533, -10.10900013],
[ 7.47446403, 3.27400671, -10.74847074]])
Plug the decision_function output into the softmax function (computing the scores once instead of twice):
scores = clf.decision_function(X[:5,:])
np.exp(scores[:,0]) / np.exp(scores).sum(1)
---
array([0.98158201, 0.9713355 , 0.98527334, 0.97606361, 0.98523261])
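A side note: exponentiating raw scores can overflow for large values. The standard remedy, not needed for the small scores above but worth knowing, is to subtract the row-wise maximum first, which leaves the softmax result unchanged:

```python
import numpy as np

def softmax(z):
    # shifting by the row max does not change the result:
    # exp(z - c) / sum(exp(z - c)) == exp(z) / sum(exp(z))
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

big = np.array([[1000.0, 999.0, 998.0]])  # naive np.exp(1000.0) overflows to inf
print(softmax(big))
```

scipy.special.softmax implements the same stabilized computation.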
clf.predict_proba(X[:5,:])[:,0]
---
array([0.98158201, 0.9713355 , 0.98527334, 0.97606361, 0.98523261])
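The same check can be run over the full dataset and all three classes at once (again assuming multi_class='multinomial' is accepted by your scikit-learn version):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000, multi_class='multinomial').fit(X, y)

scores = clf.decision_function(X)
manual = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax

print(np.allclose(manual, clf.predict_proba(X)))
```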