Logistic Regression -- Octave Implementation

The logistic regression cost function is convex, so gradient descent (with a suitable learning rate) converges to the global minimum rather than getting stuck in a local optimum.

Problem 1: Logistic Regression

Suppose that you are the administrator of a university department and you want to determine each applicant's chance of admission based on their results on two exams. You have historical data from previous applicants that you can use as a training set for logistic regression. For each training example, you have the applicant's scores on two exams and the admissions decision.
Your task is to build a classification model that estimates an applicant's probability of admission based on the scores from those two exams. This outline and the framework code in ex2.m will guide you through the exercise.

1. The main step is to compute the cost function and the gradient.
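
For reference, the quantities the function below computes are the (unregularized) logistic regression cost and its gradient:

J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

where h_\theta(x) = g(\theta^T x) and g is the sigmoid function.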

function [J, grad] = costFunction(theta, X, y)
% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

Jtmp=0;
h= zeros(m,1);

%step1: compute the linear term X*theta for every training example
hx = X*theta;

%step2: apply the sigmoid to obtain the hypothesis h
h = sigmoid(hx);

%step3:compute cost function's sum part
for i=1:m,
    Jtmp=Jtmp+(-y(i)*log(h(i))-(1-y(i))*log(1-h(i)));
end;
J=(1/m)*Jtmp;

%step4: accumulate the gradient over all training examples
sum1 = zeros(size(X,2),1);   % one entry per feature
for i=1:m
    sum1 = sum1+(h(i)-y(i)).*X(i,:)';
end;
    
grad= (1/m)*sum1;
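
The loops above work, but the same result can be obtained with vectorized operations; a minimal equivalent sketch, using the same variables as costFunction:

% Vectorized equivalent of the loop version above
h = sigmoid(X*theta);                            % m x 1 vector of predictions
J = (1/m) * (-y'*log(h) - (1-y)'*log(1-h));      % scalar cost
grad = (1/m) * (X'*(h - y));                     % n x 1 gradient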

2. The sigmoid (logistic, S-shaped) function called above is as follows:

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z element-wise.

% You need to return the following variables correctly 
g = zeros(size(z));
for i =1:size(z,1)
    for j =1:size(z,2)
        g(i,j)=1/(1+e^(-z(i,j)));
    end;
end;
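
Since exp() in Octave is applied element-wise, the whole body of sigmoid can be reduced to a single vectorized line; an equivalent sketch:

g = 1 ./ (1 + exp(-z));   % element-wise; works for scalars, vectors, and matrices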

3. Call Octave's built-in function fminunc() to obtain the optimal theta and the minimum cost.

[cost, grad] = costFunction(initial_theta, X, y); % evaluate the cost and gradient at initial_theta before running the optimizer

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400); % 'GradObj','on' tells fminunc that our function also returns the gradient

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost 
[theta, cost] = ...
	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options); % @(t) fixes X and y so fminunc varies only theta; returns the optimal theta and the minimum cost

4. To make a prediction, substitute the new input x into the hypothesis sigmoid(x*theta):

prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n\n'], prob);


5. Measure the prediction accuracy on the training set; the predict function is as follows:

function p = predict(theta, X)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);
%step1:compute hx
hx = X*theta;

%step2: predict 1 when sigmoid(hx) >= 0.5, otherwise 0
for i =1:m
    if sigmoid(hx(i))>= 0.5,
        p(i) = 1;
    else
       p(i) = 0;
    end;
end;
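
The loop in predict can likewise be replaced by one vectorized thresholding line; an equivalent sketch:

p = double(sigmoid(X*theta) >= 0.5);   % 1 where the predicted probability is at least 0.5, else 0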


% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);

Problem 2: Regularized Logistic Regression

The problem statement: In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to ensure it is functioning correctly. Suppose you are the product manager of the factory and you have the test results for some microchips on two different tests. From these two tests, you would like to determine whether the microchips should be accepted or rejected. To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model. You will use another script, ex2_reg.m, to complete this portion of the exercise.

Plotting the training set gives the following figure:

[Figure 1: plot of the training set]

With this data, a linear decision boundary would certainly underfit, so we need to add features: from x1 and x2 we generate more polynomial features, going from 3 terms to 28 terms.

[Figure 2]
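
The exercise framework supplies a feature-mapping helper for this step; a minimal sketch of such a mapping (here called mapFeature, producing all polynomial terms of x1 and x2 up to degree 6, i.e. 28 columns including the bias term):

function out = mapFeature(X1, X2)
% MAPFEATURE Map two input features to all polynomial terms up to degree 6:
%   [1, x1, x2, x1^2, x1*x2, x2^2, x1^3, ..., x1*x2^5, x2^6]  (28 columns)
degree = 6;
out = ones(size(X1(:,1)));            % bias column of ones
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);   % append the term x1^(i-j) * x2^j
    end
end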

However, too many feature terms can easily lead to overfitting (过拟合), so we use regularization to suppress it. Note that theta(1) (the intercept term theta_0) is excluded from the regularization term of the cost function and of the gradient, so grad(1) gets no regularization term either.
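
With regularization, the cost and gradient become (theta_0, i.e. theta(1) in Octave, is not penalized):

J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \qquad (j \geq 1)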

The costFunctionReg function is as follows:

function [J, grad] = costFunctionReg(theta, X, y, lambda)

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

Jtmp=0;
h= zeros(m,1);

%step1: compute the linear term X*theta for every training example
hx = X*theta;

%step2: apply the sigmoid to obtain the hypothesis h
h = sigmoid(hx);

%step3:compute cost function's sum part
for i=1:m,
    Jtmp=Jtmp+(-y(i)*log(h(i))-(1-y(i))*log(1-h(i)));
end;
J = (1/m)*Jtmp + (lambda/(2*m))*sum(theta(2:size(X,2)).^2);   % the penalty term excludes theta(1)

%step4: accumulate the gradient over all training examples
sum1 = zeros(size(X,2),1);   % one entry per feature
for i=1:m
    sum1 = sum1+(h(i)-y(i)).*X(i,:)';
end;
grad(1) = (1/m)*sum1(1);   % intercept term: no regularization
grad(2:size(X,2)) = (1/m)*sum1(2:size(X,2)) + (lambda/m).*theta(2:size(X,2));
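
Minimizing the regularized cost works just like in Problem 1; a minimal sketch, assuming X is the mapped 28-column feature matrix and taking lambda = 1 as an example value:

initial_theta = zeros(size(X, 2), 1);                 % one parameter per mapped feature
lambda = 1;                                           % example regularization strength
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, J] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);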

Below are the results of the unit tests, which can be used to check whether each function is correct.

Note: Unit tests are not required to have the X matrix properly formatted as a set of training examples. Values in the first column may not be exclusively 1's. That is totally OK. The X data is arbitrary, and is provided merely to exercise your cost and gradient functions.

Unit Tests for sigmoid()

% sigmoid() Test Case #1
>> sigmoid([1 2 3])
ans =  0.73106   0.88080   0.95257

% sigmoid() Test Case #2 (updated)
>> sigmoid(-[1 2 3]')
ans =
  0.268941
  0.119203
  0.047426

sigmoid() Test Case #3: 

[Figure 3]

Unit test for costFunction()

[Figure 4]

 

Unit tests for predict().

>> predict([0 1 0]',magic(3))
ans =

  1
  1
  1

>> predict([2 1 -9]',magic(3))
ans =

  0
  0
  0

Unit test for costFunctionReg, with X being non-square:

(the first instance is unregularized, the second instance is regularized)
[Figure 5]

Here are some additional results for the second costFunctionReg() unit test. This splits out the results for the unregularized and regularized terms of both the cost and the gradient:
J unregularized term = 4.6832
J regularized term = 3.000
grad unregularized vector: 
    0.31722 
    0.87232
    1.64812
    2.23787
grad regularized vector:
   -1.000  ; corresponds to grad(2)
   1.000   ; corresponds to grad(3)
   2.000   ; corresponds to grad(4)



Programming files: http://pan.baidu.com/s/1i3FuBD7

Slides (PPT): http://pan.baidu.com/s/1nt1Fps1




