GitHub link: DeepLearnToolbox
A Matlab toolbox for Deep Learning.
Deep Learning is a new subfield of machine learning that focuses on learning deep hierarchical models of data. It is inspired by the human brain's apparent deep (layered, hierarchical) architecture. A good overview of the theory of Deep Learning is Learning Deep Architectures for AI.
NN/
- A library for Feedforward Backpropagation Neural Networks
CNN/
- A library for Convolutional Neural Networks
DBN/
- A library for Deep Belief Networks
SAE/
- A library for Stacked Auto-Encoders
CAE/
- A library for Convolutional Auto-Encoders
util/
- Utility functions used by the libraries
data/
- Data used by the examples
tests/
- unit tests to verify toolbox is working
For references on each library check REFS.md
On Windows, just add the folders to the MATLAB path:
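    % LiFeiteng
    % Add every subfolder of the current directory to the MATLAB path.
    root = pwd;                  % avoid shadowing the built-in path function
    files = dir(root);
    for i = 1:length(files)
        name = files(i).name;
        % skip plain files and the '.' / '..' entries
        if files(i).isdir && ~strcmp(name, '.') && ~strcmp(name, '..')
            addpath(fullfile(root, name));
            disp(['add ' name ' to path!']);
        end
    end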
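I don't intend to walk through the code; trying to learn the algorithms from the code is stupid. There are papers, reading lists, talks, and so on to learn from.
Optimization strategy for the single-hidden-layer NN in DeepLearnToolbox: mini-batch SGD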
    function [nn, L] = nntrain(nn, train_x, train_y, opts, val_x, val_y)
    %NNTRAIN trains a neural net
    % [nn, L] = nnff(nn, x, y, opts) trains the neural network nn with input x and
    % output y for opts.numepochs epochs, with minibatches of size
    % opts.batchsize. Returns a neural network nn with updated activations,
    % errors, weights and biases, (nn.a, nn.e, nn.W, nn.b) and L, the sum
    % squared error for each training minibatch.

    assert(isfloat(train_x), 'train_x must be a float');
    assert(nargin == 4 || nargin == 6,'number ofinput arguments must be 4 or 6')

    loss.train.e               = [];
    loss.train.e_frac          = [];
    loss.val.e                 = [];
    loss.val.e_frac            = [];
    opts.validation = 0;
    if nargin == 6
        opts.validation = 1;
    end

    fhandle = [];
    if isfield(opts,'plot') && opts.plot == 1
        fhandle = figure();
    end

    m = size(train_x, 1);

    batchsize = opts.batchsize;
    numepochs = opts.numepochs;

    numbatches = m / batchsize;

    assert(rem(numbatches, 1) == 0, 'numbatches must be a integer');

    L = zeros(numepochs*numbatches,1);
    n = 1;
    for i = 1 : numepochs
        tic;

        kk = randperm(m);
        for l = 1 : numbatches
            batch_x = train_x(kk((l - 1) * batchsize + 1 : l * batchsize), :);

            %Add noise to input (for use in denoising autoencoder)
            if(nn.inputZeroMaskedFraction ~= 0)
                batch_x = batch_x.*(rand(size(batch_x))>nn.inputZeroMaskedFraction);
            end

            batch_y = train_y(kk((l - 1) * batchsize + 1 : l * batchsize), :);

            nn = nnff(nn, batch_x, batch_y);
            nn = nnbp(nn);
            nn = nnapplygrads(nn);

            L(n) = nn.L;

            n = n + 1;
        end

        t = toc;

        if ishandle(fhandle)
            if opts.validation == 1
                loss = nneval(nn, loss, train_x, train_y, val_x, val_y);
            else
                loss = nneval(nn, loss, train_x, train_y);
            end
            nnupdatefigures(nn, fhandle, loss, opts, i);
        end

        disp(['epoch ' num2str(i) '/' num2str(opts.numepochs) '. Took ' num2str(t) ' seconds' '. Mean squared error on training set is ' num2str(mean(L((n-numbatches):(n-1))))]);
        nn.learningRate = nn.learningRate * nn.scaling_learningRate;
    end
    end
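To make the loop structure of nntrain easier to see, here is a minimal sketch of the same mini-batch SGD pattern (reshuffle with randperm every epoch, slice out one batch at a time, take one gradient step per batch) on a toy linear least-squares problem. The toy data, the linear model, and the names X, y, w, w_true, and learningRate are assumptions made for illustration; they stand in for the toolbox's nnff / nnbp / nnapplygrads calls and are not part of DeepLearnToolbox.

```matlab
% Toy problem (assumed for illustration): recover w_true from noiseless data.
m = 200; d = 3;
X = randn(m, d);
w_true = [1; -2; 0.5];
y = X * w_true;

w = zeros(d, 1);
learningRate = 0.1;
batchsize = 20;
numepochs = 50;
numbatches = m / batchsize;
assert(rem(numbatches, 1) == 0, 'numbatches must be an integer');

L = zeros(numepochs * numbatches, 1);   % per-minibatch loss, as in nntrain
n = 1;
for i = 1 : numepochs
    kk = randperm(m);                   % reshuffle the samples every epoch
    for l = 1 : numbatches
        idx = kk((l - 1) * batchsize + 1 : l * batchsize);
        e = X(idx, :) * w - y(idx);             % forward pass: batch error
        g = X(idx, :)' * e / batchsize;         % gradient of the loss below
        w = w - learningRate * g;               % one gradient step per batch
        L(n) = 0.5 * sum(e .^ 2) / batchsize;   % half mean squared error
        n = n + 1;
    end
end
disp(['final minibatch loss: ' num2str(L(end))]);
```

Because each epoch walks through a fresh permutation, every sample is used exactly once per epoch, which is also why nntrain asserts that the batch size divides the number of samples.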
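1. I don't see any convergence check in nntrain, nnbp, or nnapplygrads.
In my own runs I observed the mean squared error going down, then up, then down again over the epochs; small fluctuations like this are normal for SGD.
For the most part the error kept decreasing. (I usually set the number of epochs to 10-40, which may be on the small side; as I recall, the code for Hinton's 2006 Science paper ran 200 epochs, and after 3 days my run still had not finished.)
I don't see a convergence check in SAE, CNN, etc. either.
2. CAE is not finished.
3. Dropout is also available as an optional training strategy.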
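The convergence check that item 1 finds missing could look roughly like the sketch below: stop once the relative improvement of the per-epoch mean loss stays under a threshold for several epochs in a row. The names tol, patience, stalled, and stoppedAt are hypothetical, and the loss sequence is synthetic; none of this is toolbox code. Using a patience counter instead of stopping at the first stalled epoch is meant to tolerate the small down-up-down fluctuations described above.

```matlab
% Stand-in for the per-epoch mean loss; in nntrain this would be
% mean(L((n-numbatches):(n-1))) computed at the end of each epoch.
epochLoss = 1 ./ (1:40)';

tol = 0.045;      % hypothetical: minimum relative improvement that counts
patience = 3;     % hypothetical: stalled epochs tolerated before stopping
stalled = 0;
stoppedAt = numel(epochLoss);   % default: all epochs were run

for i = 2 : numel(epochLoss)
    relImprovement = (epochLoss(i-1) - epochLoss(i)) / epochLoss(i-1);
    if relImprovement < tol
        stalled = stalled + 1;  % also counts epochs where the loss went up
    else
        stalled = 0;            % real progress resets the counter
    end
    if stalled >= patience
        stoppedAt = i;
        break;
    end
end
disp(['stopped after epoch ' num2str(stoppedAt)]);
```

In a real training loop the same test would sit at the end of the epoch loop, ideally on a validation loss rather than the training loss.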
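I tested SAE, CNN, and others; with a few more epochs (20-30), accuracy on MNIST comes out at around 97%+.
The cost function could in fact be chosen differently. With the UFLDL style of optimization (a fixed optimizer that is passed a function handle to the cost function), changing the cost function would be much easier.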
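A small sketch of that function-handle style follows. The optimizer is a plain fixed-step gradient descent loop, chosen only to keep the example short (the UFLDL exercises hand the handle to minFunc instead), and the two cost functions, target, and lambda are made-up illustrations. The point is that the optimizer loop stays untouched while the handle changes.

```matlab
% Two interchangeable cost functions, each returning [cost, gradient].
% deal() lets an anonymous function return both outputs.
target = [1; -2];
squaredCost = @(theta) deal(0.5 * sum((theta - target) .^ 2), theta - target);
lambda = 0.1;     % hypothetical weight-decay strength
decayedCost = @(theta) deal( ...
    0.5 * sum((theta - target) .^ 2) + 0.5 * lambda * sum(theta .^ 2), ...
    (theta - target) + lambda * theta);

costFuns = {squaredCost, decayedCost};
solutions = zeros(2, numel(costFuns));

for k = 1 : numel(costFuns)
    costFun = costFuns{k};
    theta = zeros(2, 1);
    % The optimizer below never changes; only the handle does.
    for iter = 1 : 200
        [cost, grad] = costFun(theta);
        theta = theta - 0.5 * grad;    % plain gradient descent, fixed step
    end
    solutions(:, k) = theta;
end
disp(solutions);
```

The first column converges to target and the second to target / (1 + lambda), the minimizer of the weight-decayed cost.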
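Possible improvements:
1. Add a convergence check to the mini-batch SGD algorithm
2. Add optimization algorithms such as L-BFGS/CG
3. Complete CAE, etc.
4. Add a Sparse Autoencoder that minimizes a KL-divergence sparsity penalty, etc.
5. Have the optimization algorithms support different cost functions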
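For item 4, the KL-divergence sparsity penalty used in the UFLDL sparse autoencoder can be sketched as follows. The values of rho and beta and the toy activation matrix a are assumptions for illustration only; in a real autoencoder, a would hold the hidden-layer sigmoid activations for a batch.

```matlab
rho = 0.05;       % hypothetical target mean activation per hidden unit
beta = 3;         % hypothetical weight of the sparsity penalty

% Toy hidden activations: 3 samples (rows) x 2 hidden units (columns).
a = [0.05 0.20;
     0.05 0.40;
     0.05 0.60];
rhoHat = mean(a, 1);              % mean activation of each hidden unit

% KL(rho || rhoHat) per hidden unit, then the weighted sum over units.
kl = rho .* log(rho ./ rhoHat) + (1 - rho) .* log((1 - rho) ./ (1 - rhoHat));
penalty = beta * sum(kl);

% Derivative of the penalty with respect to rhoHat; during backpropagation
% this term is added to each hidden unit's error signal.
dPenalty = beta * (-rho ./ rhoHat + (1 - rho) ./ (1 - rhoHat));
disp(penalty);
```

The first unit already averages rho, so it contributes (numerically) nothing, while the second unit, which is active far more often than the target, is penalized.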