Machine Learning week 1 Programming Exercise: feature normalization code and plotting the learning curve of the cost function

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X 
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the 
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma. 
%
%               Note that X is a matrix where each column is a 
%               feature and each row is an example. You need 
%               to perform the normalization separately for 
%               each feature. 
%
% Hint: You might find the 'mean' and 'std' functions useful.
%       
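mu = mean(X, 1);        % 1 x n row vector of per-feature (column) means
sigma = std(X, 0, 1);   % 1 x n row vector of per-feature sample standard deviations
for i = 1:size(X, 2)
    % Center column i on its mean, then scale it by its standard deviation
    X_norm(:, i) = (X(:, i) - mu(i)) / sigma(i);
end
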
% ============================================================

end


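featureNormalize() standardizes the data: each feature is shifted and rescaled so that its mean is 0 and its standard deviation (and therefore its variance) is 1. This puts all features on a comparable scale. It does not turn the data into a normal distribution, and it does not clip values to a fixed interval. The method can be used even when the data is not normally distributed, and the standardized values are both positive and negative.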
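As a quick check of the description above, here is a minimal sketch that standardizes a small dataset and inspects the result. The numbers in `X` and `x_new` are made up for illustration and are not from the course data. The sketch uses `bsxfun` instead of a loop, which is one possible vectorized form and not the version the exercise expects.

```octave
% Hypothetical data: 4 examples (rows), 2 features (columns).
X = [2104 3; 1600 3; 2400 4; 1416 2];

mu = mean(X, 1);        % per-feature means
sigma = std(X, 0, 1);   % per-feature sample standard deviations

% Subtract the mean from every row, then divide every row by sigma.
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

disp(mean(X_norm, 1));     % approximately [0 0]
disp(std(X_norm, 0, 1));   % approximately [1 1]

% A new example has to be scaled with the SAME mu and sigma,
% not with statistics recomputed from the new data.
x_new = [1650 3];
x_new_norm = (x_new - mu) ./ sigma;
```

This is also why the function returns `mu` and `sigma` in addition to `X_norm`: they are needed later to scale any input before making a prediction.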

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.


J = sum((X * theta - y) .^ 2) / (2 * m);


% =========================================================================

end

computeCost computes the cost function.
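For reference, the one-line implementation corresponds to the usual squared-error cost for linear regression. The notation below (hypothesis $h_\theta(x) = \theta^T x$, with $m$ training examples) follows the common course convention and is an assumption here, since the post itself does not write the formula out.

```latex
J(\theta)
  = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2
  = \frac{1}{2m} \, (X\theta - y)^{T} (X\theta - y)
```

The vector `X * theta - y` holds all $m$ residuals at once, so squaring it element-wise and summing gives the same value as the sum over $i$.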


function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    theta = theta - alpha * (X' * (X * theta - y)) / m;

    % ============================================================

    % Save the cost J in every iteration    
    J_history(iter) = computeCost(X, y, theta);

end

end


gradientDescent: the gradient descent method

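The J_history lines (highlighted in red in the original post) are the clever part: during gradient descent, the value of the cost function is stored after every iteration, so the learning curve of the cost function against the number of iterations can then be plotted.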
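The plotting step itself could look roughly like the sketch below. It is self-contained, so the update and the cost are written inline instead of calling gradientDescent and computeCost. The data, `alpha`, and `num_iters` are made-up values chosen for illustration, not the ones used in the exercise.

```octave
% Synthetic data (assumed for illustration): y is roughly 1 + 2*x.
x = (1:10)';
y = 1 + 2 * x + [0.1; -0.2; 0.05; 0.0; -0.1; 0.2; -0.05; 0.1; 0.0; -0.1];
m = length(y);

% Design matrix: intercept column plus one normalized feature.
X = [ones(m, 1), (x - mean(x)) / std(x)];

theta = zeros(2, 1);
alpha = 0.1;        % learning rate (assumed value)
num_iters = 100;    % number of iterations (assumed value)
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % Same update rule as in gradientDescent
    theta = theta - alpha * (X' * (X * theta - y)) / m;
    % Same cost as in computeCost, stored once per iteration
    J_history(iter) = sum((X * theta - y) .^ 2) / (2 * m);
end

% Learning curve: cost versus iteration number.
figure;
plot(1:num_iters, J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');
```

If the learning rate is suitable, the curve should fall quickly and then flatten. A curve that rises or oscillates usually suggests that `alpha` is too large.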

For details, see the code skeleton provided with the week 2 exercise of Andrew Ng's Coursera Machine Learning course:

https://class.coursera.org/ml-008/assignment

