=========================================================================================
I've been reading about Deep Learning for a while now, and I've gone through quite a few blog posts and papers.
Honestly, though, that leaves the implementation side neglected: for one thing my computer isn't very good, and for another I'm not yet able to write a toolbox of my own.
So far I've only written code on top of existing frameworks, following Andrew Ng's UFLDL tutorial (that code is on github).
Later I found a MATLAB Deep Learning toolbox whose code is quite simple; it feels well suited to studying the algorithms.
Another point is that a MATLAB implementation can skip a lot of data-structure code, which keeps the algorithmic ideas very clear.
So I want to read through this toolbox's code to consolidate what I've learned, and also to lay a foundation for the next step of putting things into practice.
(This post only reads the algorithms from the code's perspective; for the actual theory you still need to read the papers.
I'll give the names of some relevant papers along the way. The aim is to lay out the algorithmic steps, not to dig into the principles and formulas.)
==========================================================================================
Code used: DeepLearnToolbox (download it from the project's repository). Thanks to the toolbox's author.
==========================================================================================
Today it's the CNN. CNNs are a bit of a tangle to explain, so you may want to read up on convolution and pooling (subsampling) beforehand, plus this one: tornadomeet's blog post.
Below is that classic figure (the LeNet-style architecture):
======================================================================================================
Open \tests\test_example_CNN.m and have a look:
- cnn.layers = {
-     struct('type', 'i')                                     % input layer
-     struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5)   % convolution layer
-     struct('type', 's', 'scale', 2)                         % subsampling layer
-     struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5)  % convolution layer
-     struct('type', 's', 'scale', 2)                         % subsampling layer
- };
- cnn = cnnsetup(cnn, train_x, train_y);
- opts.alpha = 1;
- opts.batchsize = 50;
- opts.numepochs = 1;
- cnn = cnntrain(cnn, train_x, train_y, opts);
This time things look a little more involved. First the layers; there are three types: 'i' is input, 'c' is convolution, 's' is subsampling.
For a 'c' layer, outputmaps is how many feature maps the convolution produces; in the classic figure above, for instance, the first convolution layer produces six feature maps.
For a 'c' layer, kernelsize is simply the size of the patch used for the convolution.
For an 's' layer, scale means pooling over scale*scale regions.
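As a quick sanity check (my own sketch, assuming the 28x28 MNIST images that test_example_CNN.m uses), you can trace the feature-map sizes through these layers by hand:
- % My own sketch: trace map sizes for 28x28 inputs (not toolbox code)
- mapsize = [28 28];
- mapsize = mapsize - 5 + 1;   % 24x24 after the first 5x5 'valid' convolution
- mapsize = mapsize / 2;       % 12x12 after the first 2x2 subsampling
- mapsize = mapsize - 5 + 1;   % 8x8  after the second 5x5 convolution
- mapsize = mapsize / 2;       % 4x4  after the second 2x2 subsampling
- % final feature vector: 4 * 4 * 12 = 192 units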
After that it's the familiar routine: cnnsetup() and cnntrain(). Let's look at the code.
\CNN\cnnsetup.m
Mostly this is about what the parameters do; see the comments in the code for the details.
- function net = cnnsetup(net, x, y)
-     inputmaps = 1;                            % number of maps feeding the current layer
-     mapsize = size(squeeze(x(:, :, 1)));      % spatial size of one input image, e.g. [28 28]
-
-     for l = 1 : numel(net.layers)
-         if strcmp(net.layers{l}.type, 's')
-             % subsampling shrinks each map by the pooling scale
-             mapsize = mapsize / net.layers{l}.scale;
-             assert(all(floor(mapsize)==mapsize), ['Layer ' num2str(l) ' size must be integer. Actual: ' num2str(mapsize)]);
-             for j = 1 : inputmaps
-                 net.layers{l}.b{j} = 0;       % a bias per map (not actually used by cnnff's pooling)
-             end
-         end
-         if strcmp(net.layers{l}.type, 'c')
-             % a 'valid' convolution shrinks each map by kernelsize - 1
-             mapsize = mapsize - net.layers{l}.kernelsize + 1;
-             % fan_out: weights leading out of one input map (for init scaling)
-             fan_out = net.layers{l}.outputmaps * net.layers{l}.kernelsize ^ 2;
-             for j = 1 : net.layers{l}.outputmaps
-                 % fan_in: weights feeding one output map
-                 fan_in = inputmaps * net.layers{l}.kernelsize ^ 2;
-                 for i = 1 : inputmaps
-                     % one kernel per (input map, output map) pair, drawn uniformly
-                     % from [-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))]
-                     net.layers{l}.k{i}{j} = (rand(net.layers{l}.kernelsize) - 0.5) * 2 * sqrt(6 / (fan_in + fan_out));
-                 end
-                 net.layers{l}.b{j} = 0;       % one bias per output map
-             end
-             inputmaps = net.layers{l}.outputmaps;
-         end
-     end
-
-     % the final maps are vectorized and fed into a fully-connected output layer
-     fvnum = prod(mapsize) * inputmaps;        % length of the feature vector
-     onum = size(y, 1);                        % number of output units (classes)
-     net.ffb = zeros(onum, 1);
-     net.ffW = (rand(onum, fvnum) - 0.5) * 2 * sqrt(6 / (onum + fvnum));
- end
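A side note on that (rand - 0.5) * 2 * sqrt(6 / (fan_in + fan_out)) pattern: it draws each weight uniformly from [-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))], the same scaling as Glorot & Bengio's "normalized initialization". A tiny check of my own for the first convolution layer:
- % My own check, not toolbox code: for the first conv layer,
- fan_in  = 1 * 5^2;                     % 1 input map,   5x5 kernel
- fan_out = 6 * 5^2;                     % 6 output maps, 5x5 kernel
- bound = sqrt(6 / (fan_in + fan_out));  % ~0.185
- k = (rand(5) - 0.5) * 2 * bound;       % uniform on [-bound, bound]
- assert(all(abs(k(:)) <= bound));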
\CNN\cnntrain.m
cnntrain moves to the same rhythm as nntrain:
- net = cnnff(net, batch_x);        % feed-forward
- net = cnnbp(net, batch_y);        % back-propagate to get the gradients
- net = cnnapplygrads(net, opts);   % gradient-descent update
cnntrain computes the gradients with back propagation. Let's look at these three functions in turn:
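For context, here is roughly the loop those three calls sit in (my paraphrase of cnntrain.m's structure; variable names are mine and details may differ):
- % Rough shape of cnntrain's loop (my paraphrase, not a verbatim copy)
- m = size(train_x, 3);                 % number of training images
- numbatches = m / opts.batchsize;
- for i = 1 : opts.numepochs
-     kk = randperm(m);                 % shuffle once per epoch
-     for b = 1 : numbatches
-         idx = kk((b - 1) * opts.batchsize + 1 : b * opts.batchsize);
-         batch_x = train_x(:, :, idx);
-         batch_y = train_y(:, idx);
-         net = cnnff(net, batch_x);
-         net = cnnbp(net, batch_y);
-         net = cnnapplygrads(net, opts);
-     end
- end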
cnnff.m
This part of the computation is fairly simple and easy to trace. It's best to first read the step-by-step walkthrough in tornadomeet's post, which explains it clearly. (sigm() below is the toolbox's logistic sigmoid utility, sigm.m.)
- function net = cnnff(net, x)
-     n = numel(net.layers);
-     net.layers{1}.a{1} = x;           % a{j}: the j-th feature map (a stack, one slice per example)
-     inputmaps = 1;
-
-     for l = 2 : n
-         if strcmp(net.layers{l}.type, 'c')
-             % convolution layer: each output map sums the 'valid'
-             % convolutions of all input maps with its kernels
-             for j = 1 : net.layers{l}.outputmaps
-                 % z accumulates responses; 'valid' shrinks each map by kernelsize - 1
-                 z = zeros(size(net.layers{l - 1}.a{1}) - [net.layers{l}.kernelsize - 1 net.layers{l}.kernelsize - 1 0]);
-                 for i = 1 : inputmaps
-                     % convn with a 2-D kernel convolves every slice of the stack
-                     z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
-                 end
-                 % add the bias and squash with the sigmoid
-                 net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j});
-             end
-             inputmaps = net.layers{l}.outputmaps;
-         elseif strcmp(net.layers{l}.type, 's')
-             % subsampling layer: plain mean pooling, no bias, no nonlinearity
-             for j = 1 : inputmaps
-                 % average every scale x scale window via convolution...
-                 z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');
-                 % ...then keep every scale-th entry, i.e. the non-overlapping windows
-                 net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
-             end
-         end
-     end
-
-     % vectorize the last layer's maps into one feature vector per example
-     net.fv = [];
-     for j = 1 : numel(net.layers{n}.a)
-         sa = size(net.layers{n}.a{j});
-         net.fv = [net.fv; reshape(net.layers{n}.a{j}, sa(1) * sa(2), sa(3))];
-     end
-     % fully-connected output layer with a sigmoid
-     net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
- end
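The convn-plus-striding trick in the 's' branch is neat: the all-ones kernel averages every window at every offset, and the strided indexing then keeps only the non-overlapping ones. A toy check of my own (not toolbox code):
- % My own toy check of the mean-pooling trick (scale = 2)
- A = magic(4);                        % one 4x4 feature map
- z = conv2(A, ones(2) / 4, 'valid');  % average of every 2x2 window (all offsets)
- p = z(1 : 2 : end, 1 : 2 : end);     % keep only the non-overlapping windows
- assert(abs(p(1, 1) - mean(mean(A(1:2, 1:2)))) < 1e-12);   % top-left 2x2 block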
cnnbp.m
This one made me want to cry: the code is rather tangled, and I had to go hunting for references again. "Notes on Convolutional Neural Networks" is the most helpful.
One place where this toolbox differs from the Notes: the toolbox applies no sigmoid activation in the subsampling (i.e. pooling) layers; it just pools. So read carefully here: in this toolbox the subsampling layers contribute no activation gradient, whereas the Notes do compute one.
Also, this toolbox has no combinations of feature maps, i.e. the connection table shown in tornadomeet's post.
For the specifics, go read the paper above.
Then on to the code:
- function net = cnnbp(net, y)
-     n = numel(net.layers);
-
-     net.e = net.o - y;                                   % output error
-     net.L = 1/2 * sum(net.e(:) .^ 2) / size(net.e, 2);   % squared-error loss, averaged over the batch
-
-     % back-propagate the deltas
-     net.od = net.e .* (net.o .* (1 - net.o));            % output delta: e .* sigmoid'(o)
-     net.fvd = (net.ffW' * net.od);                       % delta of the feature vector
-     if strcmp(net.layers{n}.type, 'c')
-         % only a conv layer has a sigmoid to differentiate through;
-         % a subsampling layer here has no nonlinearity
-         net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
-     end
-
-     % reshape the feature-vector delta back into per-map deltas
-     sa = size(net.layers{n}.a{1});
-     fvnum = sa(1) * sa(2);
-     for j = 1 : numel(net.layers{n}.a)
-         net.layers{n}.d{j} = reshape(net.fvd(((j - 1) * fvnum + 1) : j * fvnum, :), sa(1), sa(2), sa(3));
-     end
-
-     % walk backwards through the remaining layers
-     for l = (n - 1) : -1 : 1
-         if strcmp(net.layers{l}.type, 'c')
-             % next layer is subsampling: upsample its delta with expand(),
-             % spread it evenly (1/scale^2), and multiply by sigmoid'(a)
-             for j = 1 : numel(net.layers{l}.a)
-                 net.layers{l}.d{j} = net.layers{l}.a{j} .* (1 - net.layers{l}.a{j}) .* (expand(net.layers{l + 1}.d{j}, [net.layers{l + 1}.scale net.layers{l + 1}.scale 1]) / net.layers{l + 1}.scale ^ 2);
-             end
-         elseif strcmp(net.layers{l}.type, 's')
-             % next layer is convolutional: a 'full' convolution with the
-             % 180-degree-rotated kernels routes each delta back
-             for i = 1 : numel(net.layers{l}.a)
-                 z = zeros(size(net.layers{l}.a{1}));
-                 for j = 1 : numel(net.layers{l + 1}.a)
-                     z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
-                 end
-                 net.layers{l}.d{i} = z;
-             end
-         end
-     end
-
-     % compute the gradients
-     for l = 2 : n
-         if strcmp(net.layers{l}.type, 'c')
-             for j = 1 : numel(net.layers{l}.a)
-                 for i = 1 : numel(net.layers{l - 1}.a)
-                     % kernel gradient: correlate the input maps with the deltas,
-                     % averaged over the batch (the 3rd dimension)
-                     net.layers{l}.dk{i}{j} = convn(flipall(net.layers{l - 1}.a{i}), net.layers{l}.d{j}, 'valid') / size(net.layers{l}.d{j}, 3);
-                 end
-                 net.layers{l}.db{j} = sum(net.layers{l}.d{j}(:)) / size(net.layers{l}.d{j}, 3);
-             end
-         end
-     end
-     net.dffW = net.od * (net.fv)' / size(net.od, 2);     % fully-connected weight gradient
-     net.dffb = mean(net.od, 2);                          % fully-connected bias gradient
-
-     function X = rot180(X)
-         X = flipdim(flipdim(X, 1), 2);                   % rotate a kernel by 180 degrees
-     end
- end
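expand() is a utility bundled with the toolbox; it replicates each element of an array into a block. For a single 2-D map, expand(d, [2 2 1]) should behave like kron(d, ones(2)). A small sketch of my own showing what the upsampling in the 'c' branch produces for scale = 2:
- % My own sketch: upsampling a pooling-layer delta (scale = 2)
- d = [1 2; 3 4];               % delta of one 2x2 pooled map
- up = kron(d, ones(2)) / 4;    % same effect as expand(d, [2 2 1]) / 2^2 on a 2-D map
- % up =
- %     0.25    0.25    0.50    0.50
- %     0.25    0.25    0.50    0.50
- %     0.75    0.75    1.00    1.00
- %     0.75    0.75    1.00    1.00
- % each pooled delta is spread evenly over the scale x scale window it came from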
cnnapplygrads.m
This part is relaxing: the gradients are already in hand, so we just apply the updates one by one.
- function net = cnnapplygrads(net, opts)
-     for l = 2 : numel(net.layers)
-         if strcmp(net.layers{l}.type, 'c')
-             for j = 1 : numel(net.layers{l}.a)
-                 for ii = 1 : numel(net.layers{l - 1}.a)
-                     % plain gradient descent on each kernel
-                     net.layers{l}.k{ii}{j} = net.layers{l}.k{ii}{j} - opts.alpha * net.layers{l}.dk{ii}{j};
-                 end
-                 net.layers{l}.b{j} = net.layers{l}.b{j} - opts.alpha * net.layers{l}.db{j};
-             end
-         end
-     end
-     % and the fully-connected output layer
-     net.ffW = net.ffW - opts.alpha * net.dffW;
-     net.ffb = net.ffb - opts.alpha * net.dffb;
- end
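Note that the update is the plainest possible SGD step, w <- w - alpha * dw, with no momentum or weight decay; with opts.alpha = 1 the raw gradient is subtracted directly. For instance, a kernel weight of 0.30 with gradient 0.05 becomes 0.30 - 1 * 0.05 = 0.25 after one batch.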
cnntest.m
Alright, we do need to know how the final results come out:
- function [er, bad] = cnntest(net, x, y)
-     net = cnnff(net, x);           % one forward pass; net.o holds the outputs
-     [~, h] = max(net.o);           % predicted class = index of the largest output
-     [~, a] = max(y);               % true class from the one-hot labels
-     bad = find(h ~= a);            % indices of the misclassified examples
-     er = numel(bad) / size(y, 2);  % error rate
- end
And that's it: after one pass of cnnff, net.o is the result.
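To close the loop, a full run looks roughly like test_example_CNN.m (sketched from memory, so treat the reshaping details as assumptions; mnist_uint8.mat ships with the toolbox):
- % Rough end-to-end run, in the spirit of test_example_CNN.m
- load mnist_uint8;                                          % ships with the toolbox
- train_x = double(reshape(train_x', 28, 28, 60000)) / 255;
- test_x  = double(reshape(test_x',  28, 28, 10000)) / 255;
- train_y = double(train_y');
- test_y  = double(test_y');
-
- cnn = cnnsetup(cnn, train_x, train_y);       % cnn.layers as defined above
- cnn = cnntrain(cnn, train_x, train_y, opts);
- [er, bad] = cnntest(cnn, test_x, test_y);
- fprintf('test error: %.2f%%\n', er * 100);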
Summary
Just code!
This is a model from 1989! Lately it has even been combined with RBMs and produced the best result on ImageNet (I believe it's this one?):
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012 (video and slides).
http://www.cs.utoronto.ca/~rsalakhu/papers/dbm.pdf
References:
【Deep learning: 38 (a brief introduction to Stacked CNN)】
【UFLDL】
【Notes on Convolutional Neural Networks】
【Convolutional Neural Networks (LeNet)】 — from the deeplearning.net Theano tutorials