Logistic regression is a classification algorithm that converts predictions into probabilities by mapping the output of a linear regression into the interval (0, 1); that is, it fits a regression formula for the classification boundary of the dataset and classifies accordingly.
Logistic regression chooses the Sigmoid as its mapping function; the Sigmoid function and its derivative are shown in the figure.
Reasons for choosing the Sigmoid function: it is smooth and differentiable everywhere, it maps any real input into (0, 1) so the output can be read as a probability, and its derivative takes the simple form $\sigma'(z) = \sigma(z)\big(1 - \sigma(z)\big)$.
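These properties can be verified numerically. The sketch below (an illustrative snippet, not from the original post) defines the Sigmoid and checks its derivative identity against a finite-difference approximation:

```python
import numpy as np

def sigmoid(z):
    """Map any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative identity: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Numerical check of the derivative identity at a few sample points
z = np.array([-2.0, 0.0, 3.0])
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(numeric, sigmoid_grad(z)))  # True
```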
Given a dataset of size $m$:

$$\boldsymbol{D} = \{ (\boldsymbol{x}_1, y_1), (\boldsymbol{x}_2, y_2), \dots, (\boldsymbol{x}_m, y_m) \}$$
the prediction of the linear regression model is:

$$f(\boldsymbol{x}_i) = \boldsymbol{w}^T \boldsymbol{x}_i$$
Logistic regression maps this prediction into the interval (0, 1) through the sigmoid function:

$$h_{\boldsymbol w}(\boldsymbol{x}_i) = \sigma(\boldsymbol{w}^T \boldsymbol{x}_i) = \frac{1}{1 + e^{-\boldsymbol{w}^T \boldsymbol{x}_i}}$$

where $0 \leq h_{\boldsymbol w}(\boldsymbol{x}_i) \leq 1$, $\boldsymbol{w} = (w_0, w_1, w_2, \dots, w_n)^T$, $\boldsymbol{x}_i = (1, x_i^{(1)}, x_i^{(2)}, \dots, x_i^{(n)})^T$, $n$ is the number of independent variables (features), and $y_i = 0$ or $1$ (for $i$ from $1$ to $m$).
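A minimal numeric sketch of this model, using made-up values for $\boldsymbol{w}$ and one augmented sample $\boldsymbol{x}_i$ (neither taken from the text):

```python
import numpy as np

# Illustrative values: w = (w0, w1, w2); x_i carries a leading 1 for the bias term w0
w = np.array([0.5, 1.0, -2.0])
x_i = np.array([1.0, 2.0, 0.5])   # x_i = (1, x^(1), x^(2))

# h_w(x_i) = sigma(w^T x_i)
h = 1.0 / (1.0 + np.exp(-w @ x_i))
print(0.0 < h < 1.0)  # True: the output always lies inside (0, 1)
```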
Logistic regression is a probabilistic discriminative model whose parameters $\boldsymbol w$ are usually estimated by maximum likelihood (MLE), which yields the log-likelihood (cross-entropy) loss function, defined as:

$$L(\boldsymbol w) = -\frac{1}{m} \sum_{i=1}^{m}\left[ y_i \log\big(h_{\boldsymbol w}(\boldsymbol{x}_i)\big) + (1 - y_i) \log\big(1 - h_{\boldsymbol w}(\boldsymbol{x}_i)\big) \right]$$
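This loss rewards confident correct predictions and heavily penalizes confident wrong ones. A small sketch with toy labels and predictions (invented for illustration):

```python
import numpy as np

def log_loss(y, h):
    """Average cross-entropy between labels y in {0, 1} and predictions h in (0, 1)."""
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

y = np.array([1, 0, 1])
good = np.array([0.9, 0.1, 0.8])   # all three predictions near the true label
bad = np.array([0.9, 0.1, 0.01])   # last prediction confidently wrong
print(log_loss(y, good) < log_loss(y, bad))  # True
```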
First, differentiate $h_{\boldsymbol w}(\boldsymbol{x}_i)$ with respect to $\boldsymbol{w}$:

$$\frac{\partial h_{\boldsymbol w}(\boldsymbol{x}_i)}{\partial \boldsymbol{w}} = \boldsymbol{x}_i \, h_{\boldsymbol w}(\boldsymbol{x}_i)\big(1 - h_{\boldsymbol w}(\boldsymbol{x}_i)\big)$$
Then differentiate $L(\boldsymbol{w})$ with respect to $\boldsymbol{w}$:

$$\begin{aligned} \frac{\partial L(\boldsymbol{w})}{\partial \boldsymbol{w}} &= -\frac{1}{m} \sum_{i=1}^{m}\left[ \boldsymbol{x}_i y_i \big(1 - h_{\boldsymbol w}(\boldsymbol{x}_i)\big) - \boldsymbol{x}_i (1 - y_i) h_{\boldsymbol w}(\boldsymbol{x}_i) \right] \\ &= -\frac{1}{m} \sum_{i=1}^{m}\left[ \boldsymbol{x}_i y_i - \boldsymbol{x}_i y_i h_{\boldsymbol w}(\boldsymbol{x}_i) - \boldsymbol{x}_i h_{\boldsymbol w}(\boldsymbol{x}_i) + \boldsymbol{x}_i y_i h_{\boldsymbol w}(\boldsymbol{x}_i) \right] \\ &= -\frac{1}{m} \sum_{i=1}^{m} \boldsymbol{x}_i \left[ y_i - h_{\boldsymbol w}(\boldsymbol{x}_i) \right] \end{aligned}$$
Because nonlinear terms sit inside the summation, setting this gradient to zero cannot be solved for $\boldsymbol w$ directly (there is no analytical solution); iterative methods such as gradient descent are used instead.
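The gradient derived above can be plugged straight into batch gradient descent. The sketch below (a from-scratch illustration on hypothetical separable data, not the post's method) uses `grad = -X.T @ (y - h) / m`, which is exactly the final expression of the derivation:

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Batch gradient descent on the log-loss; rows of X already include a leading 1."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        h = 1.0 / (1.0 + np.exp(-X @ w))          # h_w(x_i) for every sample
        grad = -X.T @ (y - h) / len(y)            # matches dL/dw derived above
        w -= lr * grad
    return w

# Tiny separable example (hypothetical data): first column is the constant 1
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = fit_logistic(X, y)
preds = (1.0 / (1.0 + np.exp(-X @ w)) >= 0.5).astype(int)
print(preds)  # should recover the labels [0 0 1 1]
```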
Note: binary logistic regression is a special case of the maximum-entropy model, and it can be extended to one-vs-rest (one-vs-all) classification or to multiclass (multinomial) logistic regression.
Reference: the official scikit-learn documentation.
Logistic regression can be implemented with the LogisticRegression class under sklearn.linear_model.
Key parameters include `penalty` (regularization type), `C` (inverse regularization strength), `solver` (optimization algorithm), and `max_iter` (maximum number of iterations).
Usage example
>>> import numpy as np
>>> from sklearn import linear_model
>>> reg = linear_model.LogisticRegression(penalty=None)  # instantiate an unpenalized logistic regression model
>>> X = np.array([[1, 1], [3, 2], [4, 7], [2, 5]])  # data
>>> y = np.array([1, 1, 0, 0])  # class labels
>>> reg.fit(X, y)  # fit the model
LogisticRegression(penalty=None)
>>> reg.coef_
array([[ 9.22..., -8.14...]])
>>> reg.intercept_
array([9.10...])
>>> reg.classes_
array([0, 1])
>>> reg.n_features_in_
2
>>> reg.decision_function([[1, 2]])
array([2.03...])
>>> reg.predict_proba(X)
array([[0.00..., 0.99...],
       [0.00..., 1.00...],
       [0.99..., 0.00...],
       [1.00..., 0.00...]])
>>> reg.predict([[1, 3], [5, 2]])
array([0, 1])
>>> reg.score(X, y)
1.0