Univariate and multivariate linear regression were covered in the earlier notes; this section turns to logistic regression. Logistic regression uses a method similar to linear regression to solve classification problems: despite its name, it is essentially a classification algorithm. Its hypothesis takes the form:
$$h_\theta(x) = g(\theta^T X)$$
where X is the feature vector and g is the logistic function; the common choice is the sigmoid function:
$$g(z) = \frac{1}{1 + e^{-z}}$$
Its graph is an S-shaped curve that rises from 0 to 1, with $g(0) = 0.5$.
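As a quick numerical check (a minimal sketch added here, not part of the original notes), evaluating the sigmoid at a few points with NumPy confirms this shape: large negative inputs map close to 0, zero maps to exactly 0.5, and large positive inputs map close to 1.

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)), applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

# Prints approximately [4.5e-05, 0.5, 0.99995]
print(sigmoid(np.array([-10.0, 0.0, 10.0])))
```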
The model can be interpreted as follows: for a given input x, it computes the probability that the output equals 1 under the chosen parameters, i.e. $h_\theta(x) = P(y=1 \mid x; \theta)$. For example, if $h_\theta(x) = 0.7$ for a given x, then x belongs to the positive class with probability 70%. Therefore, in logistic regression:
$$y = \begin{cases} 0, & \text{if } h_\theta(x) < 0.5 \\ 1, & \text{if } h_\theta(x) \geq 0.5 \end{cases}$$
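A short derivation, added here for clarity, shows why this threshold produces a linear decision boundary: since $g(z) \geq 0.5$ exactly when $z \geq 0$, the prediction depends only on the sign of $\theta^T x$:

$$h_\theta(x) = g(\theta^T x) \geq 0.5 \iff \theta^T x \geq 0$$

so the boundary between the two predicted classes is the set of points where $\theta^T x = 0$.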
The cost function over the m training examples is:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m} \mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$
where:
$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$
This can be written in a single expression:
$$\mathrm{Cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y)\log(1 - h_\theta(x))$$
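A quick check, added for clarity, confirms that the compact form reproduces the piecewise definition, since one of the two terms vanishes depending on the label:

$$y = 1:\;\mathrm{Cost} = -\log(h_\theta(x)), \qquad y = 0:\;\mathrm{Cost} = -\log(1 - h_\theta(x))$$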
Substituting this into the cost function gives:
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$
Minimizing $J(\theta)$ with gradient descent gives the update rule (all $\theta_j$ updated simultaneously):

$$\theta_j := \theta_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
Besides gradient descent, the cost function can also be minimized with more advanced optimizers such as conjugate gradient, BFGS, and L-BFGS.
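For comparison, here is a minimal sketch of the plain batch gradient-descent update from the formula above; the learning rate `alpha` and iteration count are illustrative choices, not values from the original notes, and the full program below uses scipy's BFGS optimizer instead.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_gradient_descent(X, y, alpha=0.01, iters=10000):
    # Repeatedly apply theta_j := theta_j - alpha * (1/m) * sum_i (h(x_i) - y_i) * x_ij
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # vectorized gradient of J(theta)
        theta -= alpha * grad
    return theta
```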
The complete example below loads the ex2data1.txt dataset, fits the parameters with scipy's BFGS optimizer, reports classification metrics, and plots the decision boundary:

```python
# Logistic regression trained with a gradient-based optimizer
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.optimize as opt
from sklearn.metrics import classification_report as c_r


# Build the X matrix (with an intercept column of ones) and the y vector
def getX(df):
    ones = pd.DataFrame({'ones': np.ones(len(df))})
    data = pd.concat([ones, df], axis=1)
    return np.array(data.iloc[:, :-1])


def gety(df):
    return np.array(df.iloc[:, -1])


# Sigmoid function
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Cost function J(theta)
def cost(theta, X, y):
    return np.mean(-y * np.log(sigmoid(X @ theta)) - (1 - y) * np.log(1 - sigmoid(X @ theta)))


# Gradient of the cost function (passed to the optimizer as `jac`)
def gradient(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(y)


# Predict labels by thresholding the estimated probability at 0.5
def predict(x, theta):
    prob = sigmoid(x @ theta)
    return (prob >= 0.5).astype(int)


if __name__ == '__main__':
    # Load the data
    data = pd.read_csv('ex2data1.txt', names=['exam1', 'exam2', 'admitted'])
    # Build X and y, and initialize theta randomly
    X = getX(data)
    y = gety(data)
    theta = np.random.rand(X.shape[1])
    # Fit the parameters
    res = opt.minimize(fun=cost, x0=theta, args=(X, y), method='BFGS', jac=gradient)
    theta = res.x
    # Predict on the training set and report precision/recall
    y_pred = predict(X, theta)
    print(c_r(y, y_pred))
    # Plot the data and the decision boundary theta0 + theta1*x1 + theta2*x2 = 0
    sns.lmplot(data=data, x='exam1', y='exam2', hue='admitted',
               height=6, fit_reg=False, scatter_kws={"s": 50})
    x1 = np.arange(20, 130, step=0.1)
    x2 = (-theta[0] - theta[1] * x1) / theta[2]
    plt.plot(x1, x2, 'red')
    plt.xlim(20, 130)
    plt.ylim(20, 130)
    plt.title('Decision Boundary')
    plt.show()
```
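As a usage sketch appended at the end of the main block (or run interactively after it), the fitted `theta` can score a new applicant; the exam scores here are made-up values for illustration only.

```python
# Hypothetical applicant: exam1 = 45, exam2 = 85 (illustrative numbers, not from the dataset)
new_x = np.array([1.0, 45.0, 85.0])   # leading 1.0 is the intercept term
prob = sigmoid(new_x @ theta)         # estimated P(admitted = 1 | x; theta)
print(f'admission probability: {prob:.3f}, predicted class: {int(prob >= 0.5)}')
```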