Machine Learning: Logistic Regression

Logistic Regression

Related previous posts

Linear Regression with One Variable
Linear Regression with Multiple Variables

Introduction

Logistic regression solves classification problems with a method very similar to linear regression. Despite its name, it is fundamentally a classification algorithm.

Model Representation

$$h_\theta(x) = g(\theta^T X)$$
where X is the feature vector and g is the logistic function, most commonly the sigmoid function:
$$g(z) = \frac{1}{1 + e^{-z}}$$
The graph of this function is shown below.
[Figure 1: the sigmoid function curve]
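A quick way to reproduce this curve with matplotlib (the axis range is an arbitrary choice):

import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-10, 10, 200)
plt.plot(z, 1.0 / (1.0 + np.exp(-z)))
plt.axhline(0.5, linestyle='--', linewidth=0.8)  # the 0.5 decision threshold
plt.xlabel('z')
plt.ylabel('g(z)')
plt.title('Sigmoid function')
plt.show()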
This model can be interpreted as follows: for a given input x, it estimates the probability that the output equals 1 under the chosen parameters, i.e. $h_\theta(x) = P(y=1 \mid x;\theta)$. For example, if $h_\theta(x) = 0.7$ for a given x, then x has a 70% probability of belonging to the positive class. Therefore, in logistic regression:
$$y=\begin{cases} 0, & \text{if } h_\theta(x) < 0.5 \\ 1, & \text{if } h_\theta(x) \geq 0.5 \end{cases}$$
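As a concrete illustration (with made-up parameter values, not ones fitted from data), the snippet below computes $h_\theta(x)$ for a single example and applies the 0.5 threshold:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and one example, with the bias term 1 prepended
theta = np.array([-25.0, 0.2, 0.2])
x = np.array([1.0, 90.0, 85.0])

prob = sigmoid(theta @ x)   # h_theta(x) = P(y = 1 | x; theta)
label = int(prob >= 0.5)    # apply the 0.5 threshold
print(prob, label)          # probability close to 1, so the predicted class is 1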

Cost Function

$$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$
where:
$$\mathrm{Cost}(h_\theta(x), y)=\begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1-h_\theta(x)) & \text{if } y = 0 \end{cases}$$
Since one of the two terms vanishes depending on whether y is 1 or 0, this can be written as a single expression:
$$\mathrm{Cost}(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1-h_\theta(x))$$
Substituting this into the cost function gives:
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$
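In vectorized form (with $X$ as the design matrix whose first column is all ones and $g$ applied element-wise), the same cost can be written as follows, which is equivalent to what the implementation below computes:
$$J(\theta) = -\frac{1}{m}\left[y^{T}\log\left(g(X\theta)\right) + (1-y)^{T}\log\left(1-g(X\theta)\right)\right]$$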

Gradient Descent

$$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
Besides gradient descent, the cost function can also be minimized with more advanced optimizers such as the conjugate gradient method, BFGS, and L-BFGS.
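The implementation below hands the optimization to scipy, but a plain batch gradient descent loop implementing the update rule above could look like the following sketch (the learning rate, iteration count, and zero initialization are arbitrary placeholder choices; on unscaled features, convergence usually requires feature normalization or a very small learning rate):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_gradient_descent(X, y, alpha=0.1, iterations=10000):
    # theta_j := theta_j - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_ij, vectorized
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        gradient = X.T @ (sigmoid(X @ theta) - y) / m
        theta -= alpha * gradient
    return theta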

Python Implementation

# Logistic regression (gradient-based optimization)
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt
from sklearn.metrics import classification_report as c_r


# Extract the feature matrix X (with a bias column of ones) and the label vector y
def getX(df):
    ones = pd.DataFrame({'ones': np.ones(len(df))})
    data = pd.concat([ones, df], axis=1)
    return np.array(data.iloc[:, :-1])


def gety(df):
    return np.array(df.iloc[:, -1])


# Sigmoid function
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Cost function J(theta)
def cost(theta, X, y):
    return np.mean(-y * np.log(sigmoid(X @ theta)) - (1 - y) * np.log(1 - sigmoid(X @ theta)))


# Gradient of the cost function (passed to the optimizer as the Jacobian)
def gradientDescent(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(y)


# Predict class labels using the 0.5 threshold
def predict(x, theta):
    prob = sigmoid(x @ theta)
    return (prob >= 0.5).astype(int)


if __name__ == '__main__':
# Load the data
    data = pd.read_csv('ex2data1.txt', names=['exam1', 'exam2', 'admitted'])
    # Build X and y, and initialize theta to zeros
    # (zeros keep the initial predictions at 0.5, avoiding the log(0) cost
    #  that a random initialization can produce on these unscaled features)
    X = getX(data)
    y = gety(data)
    theta = np.zeros(X.shape[1])
    # Fit the parameters with scipy's BFGS optimizer
    res = opt.minimize(fun=cost, x0=theta, args=(X, y), method='BFGS', jac=gradientDescent)
    theta = res.x
    # Predict on the training set and print a classification report
    y_pred = predict(X, theta)
    print(c_r(y, y_pred))
    # Plot the data and the decision boundary
    sns.lmplot(x='exam1', y='exam2', data=data, hue='admitted',
               height=6, fit_reg=False, scatter_kws={"s": 50})
    # Decision boundary: theta0 + theta1 * exam1 + theta2 * exam2 = 0
    boundary_x = np.arange(20, 130, step=0.1)
    boundary_y = (-theta[0] - theta[1] * boundary_x) / theta[2]
    plt.plot(boundary_x, boundary_y, 'red')
    plt.xlim(20, 130)
    plt.ylim(20, 130)
    plt.title('Decision Boundary')
    plt.show()
