Machine Learning Course Exercises (4): Softmax

Preface

Every chapter of Stanford's UFLDL tutorial comes with a hands-on exercise.

This post walks through a solution to the exercise for the softmax chapter.

The goal of the exercise is to learn the basics of softmax regression and to train a softmax classifier that recognizes handwritten digits.

For the full problem statement, see the course webpage.

Notes

1. MATLAB indexing starts at 1, so for convenience the label of digit 0 is remapped to 10.

2. Use the sparse-matrix functions sparse and full to generate the groundTruth matrix (see the sketch after this list).

3. To prevent overflow in the exponentials, subtract the per-example maximum from every exponent before calling exp; this shifts the logits by a constant and does not change the classification result.

4. Making good use of bsxfun helps keep the code short.

5. You will also need the L-BFGS optimizer (minFunc), the gradient-checking code, and the handwritten-digit data and its loading functions from the sparse autoencoder chapter.
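A minimal sketch of notes 2-4 on a toy problem (the variable names scores and probs are illustrative, not part of the starter code):

labels = [2 1 3 2]';                     % toy labels: numClasses = 3, M = 4
numCases = numel(labels);
% Note 2: sparse(i, j, v) places v at positions (i(k), j(k)); full() densifies.
% The result is a numClasses x M indicator matrix with a 1 at (labels(k), k).
groundTruth = full(sparse(labels, 1:numCases, 1));
% Note 3: subtracting each column's maximum shifts the logits by a constant;
% exp(z - c) / sum(exp(z - c)) equals exp(z) / sum(exp(z)), so the softmax
% probabilities (and the predicted class) are unchanged, but exp cannot overflow.
scores = 10 * randn(3, numCases);                       % toy logits
shifted = bsxfun(@minus, scores, max(scores, [], 1));   % Note 4: bsxfun broadcasts
probs = exp(shifted);
probs = bsxfun(@rdivide, probs, sum(probs, 1));         % columns now sum to 1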

softmaxCost
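For reference, the cost implemented below is the softmax objective from the tutorial: the average negative log-likelihood of the correct class plus an L2 weight-decay term over all entries of theta,

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^\top x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^\top x^{(i)}}} + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=1}^{n} \theta_{ij}^2

with gradient

\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ x^{(i)} \left( 1\{y^{(i)} = j\} - p\big(y^{(i)} = j \mid x^{(i)}; \theta\big) \right) \right] + \lambda \theta_j

Both appear line by line in the code that follows.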

function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes 
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single training example
% labels - an M x 1 vector containing the label for each input example
%

% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize);

numCases = size(data, 2);

groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost and gradient for softmax regression.
%                You need to compute thetagrad and cost.
%                The groundTruth matrix might come in handy.

% Compute the hypothesis h(x): class probabilities for each example
hypothesis = theta * data;                                       % numClasses x numCases logits
hypothesis = bsxfun(@minus, hypothesis, max(hypothesis, [], 1)); % subtract column max to avoid overflow
hypothesis = exp(hypothesis);
hypothesis = bsxfun(@rdivide, hypothesis, sum(hypothesis, 1));   % normalize columns to probabilities
% Compute the cost: average cross-entropy plus L2 weight decay
theta2 = theta .^ 2;
weight_decay = 0.5 * lambda * sum(theta2(:));
log_h = groundTruth .* log(hypothesis);
cost = -sum(log_h(:)) / numCases + weight_decay;
% Compute the gradient with respect to theta
thetagrad = -(groundTruth - hypothesis) * data' / numCases + lambda * theta;

% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = thetagrad(:);
end
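Before training, it is worth sanity-checking softmaxCost with a finite-difference gradient check on a tiny random problem. The sketch below is self-contained and does not rely on the tutorial's computeNumericalGradient helper; all sizes are toy values chosen for speed:

inputSize = 8; numClasses = 4; numCases = 20; lambda = 1e-4;
data   = randn(inputSize, numCases);
labels = randi(numClasses, numCases, 1);
theta  = 0.005 * randn(numClasses * inputSize, 1);

[~, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels);

% Central differences: (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
epsilon = 1e-4;
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = epsilon;
    numgrad(i) = (softmaxCost(theta + e, numClasses, inputSize, lambda, data, labels) ...
                - softmaxCost(theta - e, numClasses, inputSize, lambda, data, labels)) / (2 * epsilon);
end

% The relative difference should be tiny (on the order of 1e-9)
disp(norm(numgrad - grad) / norm(numgrad + grad));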

softmaxPredict

function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test example
%
% Your code should produce the prediction matrix 
% pred, where pred(i) is argmax_c P(y(c) | x(i)).
 
% Unroll the parameters from theta
theta = softmaxModel.optTheta;  % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute pred using theta assuming that the labels start 
%                from 1.

% exp is monotonically increasing, so taking the argmax of theta * data
% directly gives the same prediction and avoids any risk of overflow in exp
[~, pred] = max(theta * data, [], 1);

% ---------------------------------------------------------------------

end
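Putting the pieces together, training and evaluation follow the pattern of the exercise's driver script. This sketch assumes the tutorial's softmaxTrain wrapper (which runs minFunc's L-BFGS) and the MNIST loaders loadMNISTImages / loadMNISTLabels from the earlier exercises; the file names are the ones used by the UFLDL starter code:

inputSize  = 28 * 28;   % MNIST images are 28 x 28
numClasses = 10;        % digits 1..9, with 0 remapped to label 10 (note 1)
lambda     = 1e-4;      % weight decay

images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');
labels(labels == 0) = 10;                % remap digit 0 to label 10

options.maxIter = 100;
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, images, labels, options);

testImages = loadMNISTImages('t10k-images-idx3-ubyte');
testLabels = loadMNISTLabels('t10k-labels-idx1-ubyte');
testLabels(testLabels == 0) = 10;

pred = softmaxPredict(softmaxModel, testImages);
fprintf('Accuracy: %.2f%%\n', 100 * mean(testLabels(:) == pred(:)));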

