Contents
Exercise 2: Logistic Regression
1. Logistic Regression
1.1 Plotting
1.2 sigmoid function
1.3 Cost function and gradient
1.4 Optimize
1.5 Predict
2. Regularized logistic regression
2.1 Plotting
2.2 Cost function and gradient
2.3 Optimize
Required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt
We use scipy's optimize module for training.
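For orientation, opt.minimize takes a cost function, a starting point x0, extra arguments via args, and optionally the analytic gradient via jac, and returns a result object whose x attribute holds the optimum. A toy sketch of the interface (not part of the exercise itself):

# Minimize a toy quadratic to show the opt.minimize interface used later
toy = opt.minimize(fun=lambda t: (t[0] - 3.0) ** 2, x0=np.zeros(1), method='SLSQP')
print(toy.x)  # approximately [3.]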
Reading the data and plotting
def read_file(file):
    data = pd.read_csv(file, header=None)
    data = np.array(data)
    return data

def plotData(X, y):
    plt.figure(figsize=(6, 4), dpi=150)
    X1 = X[y == 0, :]  # negative examples (y == 0)
    X2 = X[y == 1, :]  # positive examples (y == 1)
    plt.plot(X1[:, 0], X1[:, 1], 'yo')
    plt.plot(X2[:, 0], X2[:, 1], 'k+')
    plt.xlabel('Exam 1 score')
    plt.ylabel('Exam 2 score')
    # legend entries must follow the plotting order: y == 0 first, then y == 1
    plt.legend(['Not admitted', 'Admitted'], loc='upper right')
    plt.show()
## Load Data
data = read_file('ex2data1.txt')
X = data[:, 0:2]
y = data[:, 2]
## ==================== Part 1: Plotting ====================
print('Plotting data with + indicating (y = 1) examples and o indicating (y = 0) examples.')
plotData(X, y)
print('Program paused. Press enter to continue.')
input()
The resulting plot:
Unlike the linear regression in ex1, logistic regression passes the linear model's output through the sigmoid function.

The logistic regression hypothesis is:

$$h_\theta(x) = g(\theta^T x)$$

The sigmoid function is:

$$g(z) = \frac{1}{1 + e^{-z}}$$
def sigmoid(x):
    # logistic function g(z) = 1 / (1 + e^{-z})
    return 1 / (np.exp(-x) + 1)
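A quick sanity check of the implementation (illustrative, not part of the original exercise): the sigmoid should return 0.5 at zero and saturate toward 0 and 1 at the extremes.

print(sigmoid(0))                        # 0.5
print(sigmoid(np.array([-10.0, 10.0])))  # approximately [0. 1.]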
The logistic regression cost function is:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

and its gradient is:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
def costfunction(initial_theta, X, y):
    m, n = X.shape
    # scipy passes theta as a 1-D array, so restore the column shape
    initial_theta = initial_theta.reshape((n, 1))
    h = sigmoid(X.dot(initial_theta))
    cost = (-y.T.dot(np.log(h)) - (1 - y).T.dot(np.log(1 - h))) / m
    return cost.item()  # return a plain scalar for the optimizer

def gradient(initial_theta, X, y):
    m, n = X.shape
    initial_theta = initial_theta.reshape((n, 1))
    grad = X.T.dot(sigmoid(X.dot(initial_theta)) - y) / m
    return grad.flatten()  # scipy expects a 1-D gradient
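Before optimizing, it can be worth verifying the analytic gradient against a finite-difference approximation of the cost. A small illustrative check (not part of the original exercise; call it after Part 2 below has added the intercept column and reshaped y):

def check_gradient(theta, X, y, eps=1e-4):
    # compare the analytic gradient with central finite differences
    numeric = np.zeros_like(theta, dtype='float64')
    for j in range(theta.size):
        step = np.zeros_like(theta, dtype='float64')
        step[j] = eps
        numeric[j] = (costfunction(theta + step, X, y) -
                      costfunction(theta - step, X, y)) / (2 * eps)
    return np.max(np.abs(numeric - gradient(theta, X, y)))

# e.g. check_gradient(np.zeros(X.shape[1]), X, y) should be on the order of 1e-9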
## ============ Part 2: Compute Cost and Gradient ============
m, n = X.shape
X = np.c_[np.ones(m), X]  # add the intercept column
initial_theta = np.zeros((n + 1, 1))
y = y.reshape((m, 1))
cost, grad = costfunction(initial_theta, X, y), gradient(initial_theta, X, y)
print('Cost at initial theta (zeros): %f' % cost)
print('Expected cost (approx): 0.693')
print('Gradient at initial theta (zeros): ')
print('%f %f %f' % (grad[0], grad[1], grad[2]))
print('Expected gradients (approx): -0.1000 -12.0092 -11.2628')
# Evaluate cost and gradient at a non-zero test theta
theta1 = np.array([[-24], [0.2], [0.2]], dtype='float64')
cost, grad = costfunction(theta1, X, y), gradient(theta1, X, y)
print('Cost at test theta: %f' % cost)
print('Expected cost (approx): 0.218')
print('Gradient at test theta: ')
print('%f %f %f' % (grad[0], grad[1], grad[2]))
print('Expected gradients (approx): 0.043 2.566 2.647')
print('Program paused. Press enter to continue.')
input()
Output of this step:

We now train with scipy's optimize module to obtain the final theta (the original Octave exercise uses fminunc here).
## ============= Part 3: Optimizing using fminunc =============
initial_theta = np.zeros(n + 1)
# scipy.optimize.minimize plays the role of Octave's fminunc;
# SLSQP converges on this convex problem (BFGS or TNC would also work)
result = opt.minimize(fun=costfunction, x0=initial_theta, args=(X, y),
                      method='SLSQP', jac=gradient)
print('Cost at theta found by fminunc: %f' % result['fun'])
print('Expected cost (approx): 0.203')
print('theta:')
print('%f %f %f' % (result['x'][0], result['x'][1], result['x'][2]))
print('Expected theta (approx):')
print(' -25.161 0.206 0.201')
print('Program paused. Press enter to continue.')
input()
Training output:

For prediction, an example is classified as 1 when the predicted probability exceeds 0.5, and as 0 otherwise.
def predict(theta, X):
    n = np.size(theta, 0)  # number of parameters
    prob = sigmoid(X.dot(theta.reshape(n, 1)))  # P(y = 1 | x; theta)
    return prob > 0.5  # classify as 1 when the probability exceeds 0.5
## ============== Part 4: Predict and Accuracies ==============
prob = sigmoid(np.array([1, 45, 85], dtype='float64').dot(result['x']))
print('For a student with scores 45 and 85, we predict an admission ' \
'probability of %.3f' % prob)
print('Expected value: 0.775 +/- 0.002\n')
p = predict(result['x'], X)
print('Train Accuracy: %.1f%%' % (np.mean(p == y) * 100))
print('Expected accuracy (approx): 89.0%\n')
Prediction output:
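The learned theta defines a linear decision boundary $\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$. A sketch of how one might overlay it on the scatter plot (this plotting code is illustrative, not from the original post):

# Overlay the line theta0 + theta1*x1 + theta2*x2 = 0 on the training data;
# X here already contains the intercept column, so features are columns 1 and 2
theta_opt = result['x']
plt.figure(figsize=(6, 4), dpi=150)
plt.plot(X[y.flatten() == 0, 1], X[y.flatten() == 0, 2], 'yo')
plt.plot(X[y.flatten() == 1, 1], X[y.flatten() == 1, 2], 'k+')
plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2])
plot_y = -(theta_opt[0] + theta_opt[1] * plot_x) / theta_opt[2]  # solve for x2
plt.plot(plot_x, plot_y, 'b-')
plt.xlabel('Exam 1 score')
plt.ylabel('Exam 2 score')
plt.legend(['Not admitted', 'Admitted', 'Decision boundary'], loc='upper right')
plt.show()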
Visualization of the classification result:
2. Regularized logistic regression

Reading the data and plotting
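The post does not show the loading step for this part, so here is a sketch following the course's conventions (data file ex2data2.txt, features mapped to all polynomial terms of $x_1$ and $x_2$ up to degree 6, as the exercise's mapFeature does):

# Load the second dataset (microchip test scores) and map the two features
# to polynomial terms up to degree 6; file name and degree follow the
# course convention (sketch, not from the original post)
def mapFeature(x1, x2, degree=6):
    out = np.ones((x1.size, 1))  # intercept column
    for i in range(1, degree + 1):
        for j in range(i + 1):
            out = np.c_[out, (x1 ** (i - j)) * (x2 ** j)]
    return out

data = read_file('ex2data2.txt')
plotData(data[:, 0:2], data[:, 2])  # note: labels in plotData come from part 1
X = mapFeature(data[:, 0], data[:, 1])  # 28 feature columns
y = data[:, 2]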
Cost function and gradient

The regularized cost function adds a penalty on the parameters (the bias term $\theta_0$ is not regularized):

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$$

and its gradient is:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \quad (j \geq 1)$$
def costfunction(initial_theta, X, y, lamb):
    m, n = X.shape
    initial_theta = initial_theta.reshape((n, 1))
    y = y.reshape((m, 1))
    h = sigmoid(X.dot(initial_theta))
    # do not regularize the bias term theta_0
    reg = lamb / (2 * m) * initial_theta[1:].T.dot(initial_theta[1:])
    cost = (-y.T.dot(np.log(h)) - (1 - y).T.dot(np.log(1 - h))) / m + reg
    return cost.item()

def gradient(initial_theta, X, y, lamb):
    m, n = X.shape
    y = y.reshape((m, 1))
    initial_theta = initial_theta.reshape((n, 1))
    theta_reg = initial_theta.copy()
    theta_reg[0] = 0  # theta_0 is not regularized
    grad = X.T.dot(sigmoid(X.dot(initial_theta)) - y) / m \
        + lamb / m * theta_reg
    return grad.flatten()  # scipy expects a 1-D gradient
m, n = X.shape  # n = 28 columns after feature mapping (intercept included)
initial_theta = np.zeros(n)
lamb = 1
cost = costfunction(initial_theta, X, y, lamb)
grad = gradient(initial_theta, X, y, lamb)
print('Cost at initial theta (zeros): %f' % cost)
print('Expected cost (approx): 0.693')
result = opt.minimize(fun=costfunction, x0=initial_theta, args=(X, y, lamb),
                      method='SLSQP', jac=gradient)
p = predict(result['x'], X)
print('Train Accuracy: %.1f%%' % (np.mean(p.flatten() == y) * 100))
print('Expected accuracy (approx): 83.1%\n')
With the code as originally written (which also regularized $\theta_0$), the computed accuracy came out to 82.2%; excluding $\theta_0$ from the penalty, as the exercise specifies, should bring the result in line with the expected 83.1%.