I had previously only written linear regression in Matlab, for Andrew Ng's *Machine Learning* course on Coursera. Python has ready-made libraries for all of this, but for practice I wrote a linear regression implementation from scratch anyway. Logistic regression works on a similar principle, except that a diagonal matrix shows up in the derivation. The dataset is the .mat file from the course homework. Since this is just a practice exercise, I did not pay much attention to the input format of X; for example, the code assumes X has exactly one feature. I also find numpy's matrix operations less convenient than Matlab's.
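For reference, with $m$ training examples, design matrix $X$ (a ones column prepended for the bias), and learning rate $\alpha$, the cost and batch gradient-descent update that the code below implements are:

```latex
J(\theta) = \frac{1}{2m}\,(X\theta - y)^\top (X\theta - y), \qquad
\nabla_\theta J = \frac{1}{m}\,X^\top (X\theta - y), \qquad
\theta \leftarrow \theta - \alpha\, \nabla_\theta J
```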
# coding=utf-8
# Copyright reserved, but infringement will not be pursued
# typhoonbxq
# the University of Hong Kong
import scipy.io as scio
import numpy as np
import matplotlib.pyplot as plt
class LR:
    def __init__(self, X, Y):
        self.m = len(Y)                                # number of training examples
        self.n = len(X[0]) + 1                         # number of features, plus the bias term
        self.X = np.hstack((np.ones((self.m, 1)), X))  # prepend a column of ones to each example
        self.Y = Y.reshape((self.m, 1))
        self.theta = None
        self.cost = []
    def fit(self):
        theta = np.zeros((self.n, 1)) + 0.01           # initialize theta with a small number
        cost_old = np.array([[0]])
        M = self.X.dot(theta) - self.Y
        cost_new = M.T.dot(M) / 2.0 / self.m
        # print('the initial cost is', cost_new)
        i = 0
        while np.abs(cost_new - cost_old) > 0.001:
            cost_old = cost_new
            theta_grad = self.X.T.dot(self.X.dot(theta) - self.Y) / self.m
            # print('in the %d-th iteration' % i, 'theta:', theta)
            # print('theta_grad:', theta_grad)
            theta = theta - theta_grad * 0.001         # learning rate; I had no principled way to set it,
                                                       # so trial runs and the cost curve below decide
            M = self.X.dot(theta) - self.Y             # M is a temporary variable
            cost_new = M.T.dot(M) / 2.0 / self.m       # divide by 2m, consistent with the initial cost
            i = i + 1                                  # iteration counter
            self.cost.append(cost_new[0][0])
        self.theta = theta
        self.cost = np.array(self.cost)
        l = len(self.cost)
        axis_X = np.arange(l)
        # Plot how the cost changes over the iterations;
        # a steadily decreasing curve is good evidence the code is right
        plt.plot(axis_X, self.cost)
        plt.xlabel('Iteration No.')
        plt.ylabel('Total Cost')
        plt.title('Iteration VS Cost')
        plt.show()
    def pred(self, X):
        X = np.hstack((np.ones((len(X), 1)), X))       # prepend ones, matching the training layout
        return np.dot(X, self.theta)

    def plot(self):
        x = np.linspace(0, 30, 401)
        y = x * self.theta[1] + self.theta[0]
        plt.plot(x, y, 'r')
        plt.legend(['Prediction Line'])
# Import the data and convert it to np.array format, because the input should be arrays instead of lists
data = scio.loadmat('F:\\data.mat')
X = data['X']
X = np.array(X)
Y = data['Y']
Y = np.array(Y)
# Train the linear regression model on the given dataset
NN = LR(X,Y)
NN.fit()
X_test = np.array([[20]])
Y_test = NN.pred(X_test)
print("When the input is %.2f, the output is %.2f" % (X_test[0][0], Y_test[0][0]))
# Scatter the data as well as the calculated prediction line
plt.scatter(X[:, 0], Y)  # plt.hold was removed from matplotlib; axes are held by default now
NN.plot()
plt.show()
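As a sanity check (not part of the script above), gradient descent should land near the closed-form least-squares solution. A minimal sketch on synthetic data standing in for data.mat, using `np.linalg.lstsq`:

```python
import numpy as np

# Hypothetical synthetic data: y = 2x + 1 plus small noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 30, size=(100, 1))
Y = 2.0 * X + 1.0 + rng.normal(scale=0.1, size=(100, 1))

# Closed-form least squares on the [1, x] design matrix
A = np.hstack((np.ones((len(X), 1)), X))
theta, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(theta.ravel())  # approximately [1, 2]
```

If the fitted `self.theta` from the gradient-descent loop differs noticeably from this, the learning rate or the stopping threshold is the first thing to suspect.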
The main problem was the learning rate applied to theta_grad: at first I set it too large and the iteration did not converge. But these things are quite basic and easy to sort out.
One more note: a diagonal matrix appears when taking the derivative. What is really happening is that different vectors get dotted with different data; if you unify all of this into one elegant expression, it is a matrix (containing those vectors) multiplied by a diag matrix, which makes the formula look much cleaner.
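To make the diag-matrix remark concrete (this is my reading of it, not spelled out in the original): scaling each example $x_i$ by its own scalar $d_i$ and summing is exactly a product with $D = \mathrm{diag}(d_1,\dots,d_m)$. For logistic regression with $h_i = \sigma(\theta^\top x_i)$, the second derivative of the cost takes precisely this form:

```latex
\sum_{i=1}^{m} d_i\, x_i x_i^\top = X^\top D X, \qquad
D = \mathrm{diag}\bigl(h_1(1-h_1),\,\dots,\,h_m(1-h_m)\bigr), \qquad
\nabla^2_\theta J = \frac{1}{m}\, X^\top D X
```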