1. First, implement the forward propagation algorithm: starting from the initial Theta parameters, compute the hypothesis h and the cost J.
2. Then implement the backpropagation algorithm: using the h and J obtained above, compute the gradient Theta_grad, which gradient descent uses to drive the cost J toward its minimum (a sketch of such a loop follows below).
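For illustration only, the sketch below shows how nnCostFunction could be plugged into a plain gradient-descent loop; initial_nn_params, alpha, and num_iters are hypothetical names and values chosen here, not part of the exercise code (an advanced optimizer can be used instead).

% Minimal gradient-descent sketch around nnCostFunction (illustrative only;
% initial_nn_params, alpha, and num_iters are hypothetical).
nn_params = initial_nn_params;            % unrolled [Theta1(:); Theta2(:)]
alpha     = 1.0;                          % learning rate
num_iters = 400;
for iter = 1:num_iters
  [J, grad] = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                             num_labels, X, y, lambda);
  nn_params = nn_params - alpha * grad;   % step against the gradient to reduce J
end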
The main code is as follows:
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
% [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
% num_labels, X, y, lambda) computes the cost and gradient of the neural network. The
% parameters for the neural network are "unrolled" into the vector
% nn_params and need to be converted back into the weight matrices.
%
% The returned parameter grad should be an "unrolled" vector of the
% partial derivatives of the neural network.
%
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
% Setup some useful variables
m = size(X, 1);
% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
% following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
% variable J. After implementing Part 1, you can verify that your
% cost function computation is correct by verifying the cost
% computed in ex4.m
% Convert the label vector y (values 1..num_labels) into a one-hot matrix tmpy (m x num_labels)
tmpy = zeros(m, num_labels);
for i = 1:num_labels,
  for j = 1:m,
    if y(j) == i
      tmpy(j,i) = 1;
    end;
  end;
end;
% A vectorized Octave alternative would be: tmpy = eye(num_labels)(y, :);
% Forward propagation: compute the hypothesis h = a3
a1 = [ones(m,1) X];                 % input layer activations plus bias unit
z2 = a1 * Theta1';
a2 = sigmoid(z2);
a2 = [ones(size(a2,1),1) a2];       % hidden layer activations plus bias unit
z3 = a2 * Theta2';
a3 = sigmoid(z3);
h = a3;
% Step 1: compute the unregularized cost function
Jtmp = 0;
for i = 1:m,
  for k = 1:size(tmpy,2),
    Jtmp = Jtmp + (-tmpy(i,k)*log(h(i,k)) - (1-tmpy(i,k))*log(1-h(i,k)));
  end;
end;
J = (1/m) * Jtmp;
% Step 2: add the regularization term, skipping the bias column (k = 1)
J1 = 0;
J2 = 0;
for i = 1:size(Theta1,1)
  for k = 2:size(Theta1,2)
    J1 = J1 + Theta1(i,k)^2;
  end;
end;
J1 = (lambda/(2*m)) * J1;
for i = 1:size(Theta2,1)
  for k = 2:size(Theta2,2)
    J2 = J2 + Theta2(i,k)^2;
  end;
end;
J2 = (lambda/(2*m)) * J2;
J = J + J1 + J2;
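% The two loop-based computations above have an equivalent vectorized form
% (a sketch, assuming tmpy and h as already computed; kept as a comment):
%   J = (1/m) * sum(sum(-tmpy .* log(h) - (1 - tmpy) .* log(1 - h)));
%   J = J + (lambda/(2*m)) * (sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)));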
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
% Theta1_grad and Theta2_grad. You should return the partial derivatives of
% the cost function with respect to Theta1 and Theta2 in Theta1_grad and
% Theta2_grad, respectively. After implementing Part 2, you can check
% that your implementation is correct by running checkNNGradients
%
% Note: The vector y passed into the function is a vector of labels
% containing values from 1..K. You need to map this vector into a
% binary vector of 1's and 0's to be used with the neural network
% cost function.
%
% Backpropagation: compute the error terms and accumulate the gradients
d3 = a3 - tmpy;                                         % output layer error
d2 = (d3 * Theta2(:, 2:end)) .* sigmoidGradient(z2);    % hidden layer error (bias column dropped)
Theta1_grad = (1/m) .* (d2' * a1);
Theta2_grad = (1/m) .* (d3' * a2);
% Regularize the gradients, skipping the bias column
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + (lambda/m) .* Theta1(:, 2:end);
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + (lambda/m) .* Theta2(:, 2:end);
% Unroll the gradients into a single vector, as documented in the header
grad = [Theta1_grad(:); Theta2_grad(:)];
end
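Once the function is complete, the analytic gradients can be compared against finite differences. The following check is only a sketch: the tiny network sizes, the random data, and the costFunc handle are choices made here for illustration, not the checkNNGradients script from the exercise.

% Finite-difference check of the backpropagation gradients (illustrative sketch).
input_layer_size = 3; hidden_layer_size = 5; num_labels = 3; m = 5; lambda = 0;
Theta1 = rand(hidden_layer_size, 1 + input_layer_size) * 0.24 - 0.12;  % small random weights
Theta2 = rand(num_labels, 1 + hidden_layer_size) * 0.24 - 0.12;
X = rand(m, input_layer_size);
y = 1 + mod(1:m, num_labels)';                % labels in 1..num_labels
nn_params = [Theta1(:); Theta2(:)];
costFunc = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
                               num_labels, X, y, lambda);
[J, grad] = costFunc(nn_params);
numgrad = zeros(size(nn_params));
e = 1e-4;
for p = 1:numel(nn_params)
  perturb = zeros(size(nn_params)); perturb(p) = e;
  numgrad(p) = (costFunc(nn_params + perturb) - costFunc(nn_params - perturb)) / (2*e);
end
disp(norm(numgrad - grad) / norm(numgrad + grad));   % should be tiny, e.g. < 1e-9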
Unit test file
Unit Test for [Feedforward and Cost Function]
[J] = nnCostFunction(sec(1:1:32)', 2, 4, 4,reshape(tan(1:32), 16, 2) / 5, 1 + mod(1:16,4)', 0)
J = 10.93
Unit Test for [Regularized Cost Function]
[J] = nnCostFunction(sec(1:1:32)', 2, 4, 4, reshape(tan(1:32),16, 2) / 5, 1 + mod(1:16,4)', 0.1)
J = 170.99
Unit Test for [Sigmoid Gradient]
[sigGrad] = sigmoidGradient(sec(1:1:5)')
sigGrad =
0.117342
0.076065
0.195692
0.146323
0.027782
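The sigmoidGradient function exercised above is not listed in this post. A minimal implementation consistent with these values uses the identity g'(z) = g(z)(1 - g(z)), assuming the sigmoid helper from the exercise:

function g = sigmoidGradient(z)
%SIGMOIDGRADIENT Gradient of the sigmoid function, evaluated elementwise at z.
  g = sigmoid(z) .* (1 - sigmoid(z));
end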
Unit Test for [randInitializeWeights]
rand("state", 1:10)
randInitializeWeights(3,3)
ans =
-1.5850e-02 1.0170e-01 2.9234e-02 7.2907e-02
1.9190e-02 7.5183e-02 -1.0621e-01 1.1156e-01
-7.8807e-02 3.8784e-04 3.0667e-02 7.5665e-02
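randInitializeWeights is also not listed above. A typical implementation consistent with the small values in this test draws each weight uniformly from [-epsilon_init, epsilon_init]; the epsilon_init = 0.12 value is an assumption here (the range suggested in the exercise text):

function W = randInitializeWeights(L_in, L_out)
%RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
%incoming connections and L_out outgoing connections (plus a bias column),
%so that symmetry between hidden units is broken.
  epsilon_init = 0.12;                       % assumed range
  W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
end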
Unit Test for [Neural Network Gradient (Backpropagation)] [provided by Sindhuja V]
[J grad] = nnCostFunction(sec(1:1:32)', 2, 4, 4,reshape(tan(1:32), 16, 2) / 5, 1 + mod(1:16,4)', 0)
J = 10.931
grad =
3.0518e-001
7.1044e-002
5.1307e-002
6.2115e-001
-7.4310e-002
5.2173e-002
-2.9711e-003
-5.5435e-002
-9.5647e-003
-4.6995e-002
1.0499e-004
9.0452e-003
-7.4506e-002
7.4997e-001
-1.7991e-002
4.4328e-001
-5.9840e-002
5.3455e-001
-7.8995e-002
3.5278e-001
-5.3284e-003
8.4440e-002
-3.4384e-002
6.6441e-002
-3.4314e-002
3.3322e-001
-7.0455e-002
1.5063e-001
-1.7708e-002
2.7170e-001
7.1129e-002
1.4488e-001
% grad(1:12) are Theta1_grad
% grad(13:32) are Theta2_grad
Unit Test for [Regularized Gradient]
[J grad] = nnCostFunction(sec(1:1:32)', 2, 4, 4,reshape(tan(1:32), 16, 2) / 5, 1 + mod(1:16,4)', 0.1)
J = 170.99
grad =
0.3051843
0.0710438
0.0513066
0.6211486
-0.0522766
0.0586827
0.0053191
-0.0983900
-0.0164243
-0.0544438
1.4123116
0.0164517
-0.0745060
0.7499671
-0.0179905
0.4432801
-0.0825542
0.5440175
-0.0726739
0.3680935
-0.0167392
0.0781902
-0.0461142
0.0811755
-0.0280090
0.3428785
-0.0918487
0.1441408
-0.0260627
0.3122174
0.0779614
0.1523740
% grad(1:12) are Theta1_grad
% grad(13:32) are Theta2_grad
The code and guide can be downloaded here: http://pan.baidu.com/s/1kT1dPFh