Every chapter of Stanford's UFLDL tutorial comes with an exercise.
This post is a solution to the exercise for the Softmax Regression chapter.
The goal of the exercise is to learn the basics of the softmax algorithm and to train a softmax classifier that recognizes handwritten digits.
See the course page for the full exercise description.
1. MATLAB counts from 1, so for convenience every digit-0 label is remapped to 10.
2. Use the sparse and full functions to build the ground-truth matrix.
3. To prevent overflow in the exponential, subtract the maximum from all the exponents; this does not change the classification result (see the sketch after this list).
4. Making good use of bsxfun helps keep the code short.
5. You will need the L-BFGS routine from the sparse autoencoder chapter, as well as the gradient-checking function, the handwritten digit data, and its loading functions.
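As a quick illustration of points 1 through 4, here is a minimal sketch of building the ground-truth matrix and computing a numerically stable softmax. The variables labels, data, and theta are placeholders, not part of the official starter code:

% Remap digit 0 to class 10 so labels fit MATLAB's 1-based indexing (point 1)
labels(labels == 0) = 10;

% Ground-truth matrix: groundTruth(c, i) = 1 iff example i has label c (point 2)
numCases = size(data, 2);
groundTruth = full(sparse(labels, 1:numCases, 1));

% Numerically stable softmax: subtracting the column-wise max leaves the
% probabilities unchanged but keeps exp() from overflowing (points 3 and 4)
scores = theta * data;                                 % numClasses x numCases
scores = bsxfun(@minus, scores, max(scores, [], 1));
probs  = exp(scores);
probs  = bsxfun(@rdivide, probs, sum(probs, 1));

The full softmaxCost implementation below applies the same steps, with the weight decay term and the gradient added.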
function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes
% inputSize  - the size N of the input vector
% lambda     - weight decay parameter
% data       - the N x M input matrix, where each column data(:, i) corresponds to
%              a single training example
% labels     - an M x 1 matrix containing the labels corresponding to the input data

% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize);

numCases = size(data, 2);

groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost and gradient for softmax regression.
%                You need to compute thetagrad and cost.
%                The groundTruth matrix might come in handy.

% Compute the hypothesis h(x): class probabilities for every example
hypothesis = theta * data;
% Subtract the column-wise max before exponentiating to avoid overflow
hypothesis = bsxfun(@minus, hypothesis, max(hypothesis, [], 1));
hypothesis = exp(hypothesis);
hypothesis = bsxfun(@rdivide, hypothesis, sum(hypothesis));

% Compute the cost: average cross-entropy plus the weight decay term
theta2 = theta .^ 2;
weight_decay = 0.5 * lambda * sum(theta2(:));
log_h = groundTruth .* log(hypothesis);
cost = -sum(log_h(:)) / numCases + weight_decay;

% Compute the gradient
thetagrad = -(groundTruth - hypothesis) * data' / numCases + lambda * theta;

% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = thetagrad(:);

end
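Before training, it is worth verifying the analytic gradient against a numerical estimate on a small random problem. This is a sketch assuming the computeNumericalGradient function from the sparse autoencoder exercise is on the path; the problem sizes are arbitrary debug values:

% Small random problem so the numerical gradient is cheap to compute
inputSize  = 8;
numClasses = 10;
lambda     = 1e-4;
theta  = 0.005 * randn(numClasses * inputSize, 1);
data   = randn(inputSize, 100);
labels = randi(numClasses, 100, 1);

[cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels);
numGrad = computeNumericalGradient(@(x) softmaxCost(x, numClasses, ...
                                   inputSize, lambda, data, labels), theta);

% The relative difference should be tiny (on the order of 1e-9)
disp(norm(numGrad - grad) / norm(numGrad + grad));

Once the gradient checks out, softmaxTrain from the starter code can run minFunc's L-BFGS to fit theta; prediction is then handled by softmaxPredict below.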
function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test example
%
% Your code should produce the prediction matrix
% pred, where pred(i) is argmax_c P(y(c) | x(i)).

% Unroll the parameters from theta
theta = softmaxModel.optTheta;  % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute pred using theta assuming that the labels start
%                from 1.

% exp is monotonic, so the argmax of theta * data already gives the
% predicted class; skipping exp also avoids any risk of overflow
[~, pred] = max(theta * data, [], 1);

% ---------------------------------------------------------------------

end
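A sketch of how the predictor might be evaluated on the MNIST test set, assuming the loadMNISTImages and loadMNISTLabels helpers from the starter code and a softmaxModel returned by softmaxTrain; the file paths shown are illustrative:

images = loadMNISTImages('mnist/t10k-images-idx3-ubyte');
labels = loadMNISTLabels('mnist/t10k-labels-idx1-ubyte');
labels(labels == 0) = 10;   % remap digit 0 to class 10, as in training

pred = softmaxPredict(softmaxModel, images);

acc = mean(labels(:) == pred(:));
fprintf('Accuracy: %0.3f%%\n', acc * 100);

With the settings from the exercise, accuracy on the test set should come out around 92%.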