Logistic Regression: why the results of predict_proba sum to 1 across each row

Let's start with the description in the help documentation:
predict_proba(X)

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

For a multi_class problem, if multi_class is set to be "multinomial" the softmax function is used to find the predicted probability of each class. Else use a one-vs-rest approach, i.e calculate the probability of each class assuming it to be positive using the logistic function. and normalize these values across all the classes.
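To make the two schemes concrete, here is a minimal toy sketch with made-up scores (the numbers are illustrative, not from any fitted model; in a fitted model the column order follows clf.classes_):

import numpy as np

scores = np.array([2.0, 0.5, -1.0])  # hypothetical per-class decision scores

# ovr: sigmoid each score, then normalize so the values sum to 1
sigm = 1 / (1 + np.exp(-scores))
print(sigm / sigm.sum())

# multinomial: softmax over the scores, which sums to 1 by construction
print(np.exp(scores) / np.exp(scores).sum())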

Here we use the load_iris dataset as an example; it is a three-class dataset.

from sklearn.datasets import load_iris
import numpy as np

X,y = load_iris(return_X_y=True)

Note that LogisticRegression's multi_class parameter has no 'ovo' option.

Let's start with the 'ovr' case and build a model on the multiclass problem:

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(max_iter=1000,multi_class='ovr').fit(X,y)
clf.predict_proba(X[:5,:])
---
array([[8.96807569e-01, 1.03191359e-01, 1.07219602e-06],
       [7.78979389e-01, 2.21019299e-01, 1.31168933e-06],
       [8.34864184e-01, 1.65134802e-01, 1.01485082e-06],
       [7.90001986e-01, 2.09996107e-01, 1.90723705e-06],
       [9.12050403e-01, 8.79485212e-02, 1.07537143e-06]])

Sum each row:

clf.predict_proba(X[:5,:]).sum(1)
---
array([1., 1., 1., 1., 1.])

Set each class in turn as the sole positive class, fit a binary model, and predict:

lst = []
for i in range(3):
    y_tmp = np.int32(y == i)  # class i as the sole positive class, everything else negative
    # fit a binary model under a different name, so clf still refers to the multiclass ovr model above
    clf_i = LogisticRegression(max_iter=1000, multi_class='ovr').fit(X, y_tmp)
    lst.append(clf_i.predict_proba(X[:5,:]))

lst
---
[array([[0.01593515, 0.98406485],
        [0.02270572, 0.97729428],
        [0.0138773 , 0.9861227 ],
        [0.02290294, 0.97709706],
        [0.01395774, 0.98604226]]),
 array([[0.88676836, 0.11323164],
        [0.72271295, 0.27728705],
        [0.80494651, 0.19505349],
        [0.74027081, 0.25972919],
        [0.90491648, 0.09508352]]),
 array([[9.99998823e-01, 1.17651820e-06],
        [9.99998354e-01, 1.64562311e-06],
        [9.99998801e-01, 1.19871885e-06],
        [9.99997641e-01, 2.35892535e-06],
        [9.99998837e-01, 1.16261302e-06]])]
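Within each binary model a row sums to 1, but the three positive-class probabilities for a single sample do not: for the first sample, summing the second column of each model gives about 1.097 rather than 1, which is why the normalization step below is needed.

lst[0][0,1] + lst[1][0,1] + lst[2][0,1]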

Normalize the positive-class probabilities across the three classes:

lst[0][:,1] / (lst[0][:,1] + lst[1][:,1] + lst[2][:,1])
---
array([0.89680757, 0.77897939, 0.83486418, 0.79000199, 0.9120504 ])

Compare with the first column returned by the multiclass ovr model:

clf.predict_proba(X[:5,:])[:,0]
---
array([0.89680757, 0.77897939, 0.83486418, 0.79000199, 0.9120504 ])
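The same check works for all three columns at once. Given how lst was built, the reconstruction below should match predict_proba of the multiclass ovr model, so np.allclose should return True:

probs = np.column_stack([p[:,1] for p in lst])    # positive-class probability from each binary model
probs = probs / probs.sum(axis=1, keepdims=True)  # normalize across the three classes
np.allclose(probs, clf.predict_proba(X[:5,:]))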

Now let's look at the 'multinomial' case.

Here we don't decompose the model into three small logistic regressions and then normalize the output probabilities. For one thing, I couldn't actually reproduce it that way; for another, the documentation never says to treat each class as the positive class the way ovr does, and the source code simply calls decision_function and uses its output directly.

Fit the model:

clf = LogisticRegression(max_iter=1000,multi_class='multinomial').fit(X,y)
clf.predict_proba(X[:5,:])
---
array([[9.81582009e-01, 1.84179761e-02, 1.45003883e-08],
       [9.71335505e-01, 2.86644651e-02, 3.01881553e-08],
       [9.85273339e-01, 1.47266488e-02, 1.23397518e-08],
       [9.76063609e-01, 2.39363516e-02, 3.97048513e-08],
       [9.85232612e-01, 1.47673762e-02, 1.20065646e-08]])

As the source code does, call decision_function:

clf.decision_function(X[:5,:])
---
array([[  7.33544637,   3.35960795, -10.69505432],
       [  6.93658219,   3.41356847, -10.35015066],
       [  7.46628805,   3.26302765, -10.72931571],
       [  6.9085648 ,   3.20043533, -10.10900013],
       [  7.47446403,   3.27400671, -10.74847074]])

Plug the values returned by decision_function into the softmax function (shown here for the first column only):

scores = clf.decision_function(X[:5,:])
np.exp(scores[:,0]) / np.exp(scores).sum(1)
---
array([0.98158201, 0.9713355 , 0.98527334, 0.97606361, 0.98523261])

Compare with the first column of predict_proba:

clf.predict_proba(X[:5,:])[:,0]
---
array([0.98158201, 0.9713355 , 0.98527334, 0.97606361, 0.98523261])
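For completeness, applying a row-wise softmax to the decision scores reproduces the whole predict_proba matrix. This sketch assumes scipy >= 1.2, which provides scipy.special.softmax:

from scipy.special import softmax

# softmax each row of the decision scores; the comparison should return True
np.allclose(softmax(clf.decision_function(X[:5,:]), axis=1), clf.predict_proba(X[:5,:]))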
