Notes on MatConvNet (I) --- Overview

Written before

Er, yes, I am introducing the famous CNN framework built entirely on MATLAB. This is not easy given my limited ability and embarrassing English, but never mind, let's just get started!

Introduction

First, here is the link for an overview: http://www.vlfeat.org/matconvnet/
These notes are based on matconvnet-1.0-beta18, which is the latest version at the time of writing. I will use image recognition on the CIFAR-10 dataset as the running example.
Downloading and installation are omitted.
I will mainly walk you through four scripts, which together let you grasp the whole framework to some extent: 'cnn_cifar_init', 'cnn_cifar', 'cnn_train' and 'vl_simplenn'. The 'cifar' example will be used throughout to show the exciting parts.

How the four main scripts fit together

In fact, once compilation is finished you can simply run 'cnn_cifar', then sit back and wait for the results with a cup of tea. Yes, quite simple! But how does it actually work?

cnn_cifar

cnn_cifar calls cnn_cifar_init to initialize the CNN model, and downloads the data from the website (if it finds that you have not already downloaded it and placed it in the expected folder in advance). Once all the data are in place, it performs pre-processing, which includes, but is not limited to, whitening (whitenData) and contrast normalization (contrastNormalization). If it detects that the imdb already exists, it loads the data and passes it to the trainfn(net, imdb, ...) function. As the name suggests, trainfn is the training centre of this script. trainfn stands for one of two functions, cnn_train or cnn_train_dag. Since I only take the simple CNN case as the example here, trainfn is simply cnn_train in what follows.
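The control flow can be summarised roughly as follows. This is a paraphrased sketch, not the literal beta18 source; helper names such as getCifarImdb and getBatch are how I recall them and may differ slightly in your copy:

```matlab
% Rough paraphrase of the flow inside cnn_cifar (illustrative, not the exact source)
opts.expDir = fullfile('data', 'cifar-baseline') ;
opts.imdbPath = fullfile(opts.expDir, 'imdb.mat') ;

net = cnn_cifar_init() ;                          % 1. build the model

if exist(opts.imdbPath, 'file')                   % 2. prepare the data
  imdb = load(opts.imdbPath) ;                    %    reuse the cached imdb
else
  imdb = getCifarImdb(opts) ;                     %    download CIFAR-10, whiten, normalise
  mkdir(opts.expDir) ;
  save(opts.imdbPath, '-struct', 'imdb') ;
end

trainfn = @cnn_train ;                            % 3. pick the learner (@cnn_train_dag for DagNN)
[net, info] = trainfn(net, imdb, getBatch(opts), ...
                      'expDir', opts.expDir, ...
                      'val', find(imdb.images.set == 3)) ;  % test split used for validation
```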

cnn_cifar_init

cnn_cifar_init is just a script where you can design your model however you like. Of course, do not go too far, or you may not get good results. What it does is easy to understand: it simply builds a CNN model in advance so that cnn_cifar can use it.
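For reference, the simplenn model format that cnn_cifar_init produces looks roughly like this. The layer sizes below are only illustrative, not the exact CIFAR configuration:

```matlab
% Minimal sketch of the simplenn format: net.layers is a cell array of layer structs
net.layers = {} ;
net.layers{end+1} = struct('type', 'conv', ...
                           'weights', {{0.01*randn(5,5,3,32,'single'), zeros(1,32,'single')}}, ...
                           'stride', 1, 'pad', 2) ;
net.layers{end+1} = struct('type', 'pool', 'method', 'max', ...
                           'pool', [3 3], 'stride', 2, 'pad', [0 1 0 1]) ;
net.layers{end+1} = struct('type', 'relu') ;
% ... more conv/pool/relu blocks ...
net.layers{end+1} = struct('type', 'softmaxloss') ;
```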

vl_simplenn

Before introducing the script, I would like to share my understanding of back-propagation (BP) in the MatConvNet framework. If you find any mistakes, please contact me.

BP in MatConvNet

Before reading on, I recommend taking a look at http://blog.csdn.net/hungryof/article/details/50436231 . If you don't want to spend the time, never mind. You asked for it ~_~.
First, let’s look at the picture below.

This picture shows everything. As you can see, the data flows from one layer to the next, with the weights $w_l$ acting on it at each step. Each time, $x_l$ is fed through the layer together with $w_l$ to produce the new result $x_{l+1}$. Here $f_l$ may stand for conv, pool, softmaxloss, relu, or anything else you add as a layer. In the forward pass, the network simply computes the result $y_l$ (also written as $x_{l+1}$ to emphasise that it is the input of the next layer).
Now define $g_{l+1} = f_{l+1} \circ \cdots \circ f_L$; that is, rename the whole part behind $f_l$ as a single block $g_{l+1}$. If you have already computed $\frac{\partial g_{l+1}}{\partial x_{l+1}}$, then the chain rule gives

$$\frac{\partial g_l}{\partial x_l} = \frac{\partial g_{l+1}}{\partial x_{l+1}} \, \frac{\partial f_l}{\partial x_l}\,; \qquad \frac{\partial g_l}{\partial w_l} = \frac{\partial g_{l+1}}{\partial x_{l+1}} \, \frac{\partial f_l}{\partial w_l}$$

Note: you should be clear about exactly which part $g_l$ stands for. Yes, as you can see, it is $f_l$ combined with $g_{l+1}$.
Comparing the two formulas, you can draw this conclusion: once you have the derivative of the later part with respect to its input (here the later part is $g_{l+1}$ and its input is $x_{l+1}$), you can separately work out the derivatives of the new block, formed by attaching the previous layer, with respect to that layer's input ($x_l$) and its weights ($w_l$).
Here are a few words that may help to some extent:
During back-propagation, all a layer has is the value coming from its 'tail' and its own input. It just computes the local derivatives with respect to its input, and possibly its weights as well. Back to the previous example: when it comes to layer $l$, what should it do? It wants the derivatives of the 'tail' value with respect to $x_l$ and $w_l$; luckily, the derivative of the final output with respect to $x_{l+1}$ has already been computed, so layer $l$ only needs those two pieces to calculate $\frac{\partial g_l}{\partial w_l}$ and $\frac{\partial g_l}{\partial x_l}$. Read this at least three times if you are not yet familiar with what happens in BP.
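In MatConvNet each layer function follows exactly this pattern: called without a derivative argument it runs the forward pass, called with the derivative coming from the 'tail' it returns the derivatives with respect to its input and weights. A small sketch with vl_nnconv (the tensor sizes here are made up purely for illustration):

```matlab
% Forward: y plays the role of x_{l+1}
x = randn(32, 32, 3, 4, 'single') ;        % input x_l  (H x W x C x N)
w = randn(5, 5, 3, 10, 'single') ;         % filters w_l
b = zeros(1, 10, 'single') ;
y = vl_nnconv(x, w, b, 'pad', 2) ;         % x_{l+1} = f_l(x_l, w_l)

% Backward: dzdy is dg_{l+1}/dx_{l+1}, received from the layers behind
dzdy = randn(size(y), 'single') ;
[dzdx, dzdw, dzdb] = vl_nnconv(x, w, b, dzdy, 'pad', 2) ;
% dzdx is dg_l/dx_l (passed further back); dzdw, dzdb are dg_l/dw_l
```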

Interpretation of vl_simplenn

Hah, maybe the word 'interpretation' is a bit of a stretch. Just let it go.
Let's continue. When you glance over the script, you'll find the structure is quite simple. In the forward pass, each layer just computes its result, which becomes the input of the next layer. In the backward pass, conversely, it computes the derivatives of the 'tail' value with respect to that layer's properties, such as its weights and its input.
Note that it calls many functions, such as vl_nnconv, vl_nnconvt, vl_nnpool, vl_nnsoftmax, vl_nnloss, etc. These functions are mainly written in CUDA/C++ for fast computation and heavy optimisation. You can find the sources in matconvnet-1.0-beta18\matlab\src; the files with the '.cu' extension are the ones you are looking for. When you compile them, you get files like vl_nnconv.mexw64, which are what MATLAB actually invokes. For how to compile, you'd better refer to the website, or to my blog http://blog.csdn.net/hungryof/article/details/50788722 if you are building MatConvNet on Windows.
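Stripped of bookkeeping, the two loops inside vl_simplenn look roughly like this. This is a simplified paraphrase covering only conv-type layers; the real function also handles GPU arrays, dropout, cuDNN options, and so on:

```matlab
% Forward pass: res(i).x holds x_i, the input of layer i
res(1).x = x ;
for i = 1:numel(net.layers)
  l = net.layers{i} ;
  switch l.type
    case 'conv'
      res(i+1).x = vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, ...
                             'pad', l.pad, 'stride', l.stride) ;
    % ... 'pool', 'relu', 'softmaxloss', ... handled similarly
  end
end

% Backward pass: start from dzdy and walk the layers in reverse
res(numel(net.layers)+1).dzdx = dzdy ;
for i = numel(net.layers):-1:1
  l = net.layers{i} ;
  switch l.type
    case 'conv'
      [res(i).dzdx, res(i).dzdw{1}, res(i).dzdw{2}] = ...
          vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, res(i+1).dzdx, ...
                    'pad', l.pad, 'stride', l.stride) ;
    % ... other layer types ...
  end
end
```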
More details will be shown in the next blog.

cnn_train

As the header comment of the file says,

CNN_TRAIN An example implementation of SGD for training CNNs

This is the central part of the four scripts, as it has the most lines.. Er, just a joke~~ More seriously, it acts as the connecting link, since it works with both 'vl_simplenn' and 'cnn_cifar'. More precisely, cnn_train is an example learner implementing stochastic gradient descent with momentum to train a CNN. It works for different datasets and tasks as long as you supply a suitable getBatch function. Even better, thanks to checkpointing it automatically restarts from where it was interrupted last time.
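So, all you normally have to provide is a getBatch handle that, given the imdb and a list of image indices, returns the images and labels of one batch. A minimal sketch (the field names follow the usual imdb convention imdb.images.data / imdb.images.labels; your own imdb, and the exact option names, may differ):

```matlab
function [im, labels] = getSimpleNNBatch(imdb, batch)
% Return one mini-batch: images as an H x W x C x N single array plus their labels.
im = imdb.images.data(:,:,:,batch) ;
labels = imdb.images.labels(1,batch) ;
end

% Training is then a single call, e.g.:
% [net, info] = cnn_train(net, imdb, @getSimpleNNBatch, ...
%                         'expDir', 'data/cifar-exp', 'batchSize', 100, ...
%                         'numEpochs', 45, 'learningRate', 0.05) ;
```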

The main procedure is:
1. Set up the training opts. Since the script is meant to work for different datasets, it ships with a set of default values. Once it has received the new options from varargin, it updates opts, so that the final opts suits your training. vl_argparse is in charge of this updating (see the sketch after this list).
2. Adjust the model according to the updated opts. You'll see that the model ends up exactly as the varargin parameters specify.
3. Train and validate. This part covers restarting from checkpoints, training and validation. It calls vl_simplenn inside its inner function, via [net, stats.train, prof] = process_epoch(…) for training and [~, stats.val] = process_epoch(…) for validation.
4. Plot and print some information about the epoch, mainly the processing speed, top1err and top5err.
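The defaults-plus-override pattern in step 1 is just a couple of lines; roughly (the default values here are illustrative, not the exact ones in cnn_train):

```matlab
% Inside cnn_train: fill in defaults, then let the caller override them.
opts.batchSize = 256 ;
opts.numEpochs = 300 ;
opts.learningRate = 0.001 ;
opts.expDir = fullfile('data', 'exp') ;
opts = vl_argparse(opts, varargin) ;   % e.g. cnn_train(net, imdb, fn, 'batchSize', 100)
```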

Postscript

I believe you now know something about the framework. As I am not completely familiar with MatConvNet, I may have misunderstood some of the specific code, so the next blog will record my low-level understanding of the concrete code, mainly the four scripts. Other functions, such as vl_argparse, may also be covered; concrete functions will be included whenever they really need to be explained. Finally, I look forward to your feedback on any mistakes you've found, or your suggestions.
