

Caffe building blocks: Blobs, layers, nets

- Blob

A Blob is Caffe's unit of storage and communication: a wrapper around the actual data being processed (e.g. image batches, model parameters, derivatives for optimization).

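A Blob stores an image batch as (number N) × (channels K) × (height H) × (width W) in row-major order, so the element at index `(n,k,h,w)` is physically located at offset `((n*K+k)*H+h)*W+w`.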
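
As a quick check of the row-major offset formula, here is a minimal sketch in plain Python. The function name `blob_offset` and the example shape are illustrative assumptions, not part of Caffe.

```python
from itertools import product


def blob_offset(n, k, h, w, K, H, W):
    """Row-major offset of element (n, k, h, w) in an N x K x H x W blob."""
    return ((n * K + k) * H + h) * W + w


# Hypothetical blob shape, chosen only for illustration.
N, K, H, W = 2, 3, 4, 5

# product() varies the last index fastest, which is exactly row-major order,
# so the i-th index tuple must map to offset i.
offsets = [blob_offset(n, k, h, w, K, H, W)
           for n, k, h, w in product(range(N), range(K), range(H), range(W))]
is_row_major = offsets == list(range(N * K * H * W))
print(is_row_major)
```

The width index varies fastest in memory and the batch index slowest.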

Originally, Caffe supported only 4-D blobs and 2-D convolution (NxCxHxW); it now supports n-D blobs and (n-2)-D convolution.

- Layer

Layer categories

- Data layer

  • Image Data
  • Database
  • HDF Input
  • HDF Output
  • Input
  • Window Data
  • Memory Data
  • Dummy Data
  • Python

- Vision layer

  • Convolution

      layer {
          name: "conv1"
          type: "Convolution"
          bottom: "data"
          top: "conv1"
          # learning rate and decay multipliers for the filters
          param { lr_mult: 1 decay_mult: 1 }
          # learning rate and decay multipliers for the biases
          param { lr_mult: 2 decay_mult: 0 }
          convolution_param {
            num_output: 96     # learn 96 filters
            kernel_size: 11    # each filter is 11x11
            stride: 4          # step 4 pixels between each filter application
            pad: 0             # number of pixels to zero-pad on each side (0 = none)
            weight_filler {
              type: "gaussian" # initialize the filters from a Gaussian
              std: 0.01        # distribution with stdev 0.01 (default mean: 0)
            }
            bias_filler {
              type: "constant" # initialize the biases to zero (0)
              value: 0
            }
          }
      }

    • weight_filler type: Caffe supports several filler types for initializing the filters.

    The default type is constant; see include/caffe/filler.hpp for more details.

  • Pooling

      layer {
        name: "pool1"
        type: "Pooling"
        bottom: "conv1"
        top: "pool1"
        pooling_param {
          pool: MAX
          kernel_size: 3 # pool over a 3x3 region
          stride: 2      # step two pixels (in the bottom blob) between pooling regions
        }
      }

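
To see what `kernel_size`, `stride` and `pad` do to the spatial size, the sketch below computes the output sizes for the conv1 and pool1 settings shown above. It assumes Caffe's usual rounding (floor for convolution, ceil for pooling) and a 227x227 input, which the text does not state. The helper names are hypothetical, and the extra clipping Caffe applies to padded pooling is ignored.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution along one spatial axis (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output size of pooling along one spatial axis (ceil rounding, pad=0 case)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


input_size = 227  # assumed input height/width, not given in the text

conv1 = conv_out(input_size, kernel=11, stride=4, pad=0)
pool1 = pool_out(conv1, kernel=3, stride=2)
print("conv1: %d, pool1: %d" % (conv1, pool1))
```

Under these assumptions conv1 produces 55x55 feature maps and pool1 reduces them to 27x27.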
  • Spatial Pyramid Pooling(SPP)

  • Local Response Normalization(LRN)

  • Crop

      layer {
          bottom: "A"
          bottom: "B"
          top: "C"
          name: "crop_u1u2"
          type: "Crop"
      }

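    The Crop layer crops data. Suppose A and B have sizes (20,50,512,512) and (20,10,256,256). With axis: 1 set in crop_param, the output C takes the size of B from the channel axis onward, i.e. (20,10,256,256); with the default axis of 2 only the spatial dimensions are cropped, giving (20,50,256,256). See the Crop layer documentation for details.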
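
The shape rule for Crop can be sketched as follows. This is an illustration of my understanding of Caffe's behavior (dimensions before `axis` come from A, dimensions from `axis` onward come from B, default `axis` of 2), not code taken from Caffe. The function name `crop_shape` is hypothetical.

```python
def crop_shape(a_shape, b_shape, axis=2):
    """Shape of C when A is cropped to the size of B from `axis` onward."""
    if len(a_shape) != len(b_shape):
        raise ValueError("A and B must have the same number of dimensions")
    return tuple(a_shape[:axis]) + tuple(b_shape[axis:])


A = (20, 50, 512, 512)
B = (20, 10, 256, 256)

shape_axis1 = crop_shape(A, B, axis=1)  # channels and spatial dims cropped
shape_axis2 = crop_shape(A, B, axis=2)  # only spatial dims cropped
print(shape_axis1)
print(shape_axis2)
```

The (20,10,256,256) result quoted above corresponds to `axis=1`.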

  • Im2col

  • Deconvolution layer(transpose convolution)

    same as Convolution layer

- Recurrent Layers

  • Recurrent

  • RNN

  • Long-Short Term Memory(LSTM)

- Common Layers

  • Inner Product - fully connected layer

  • Dropout

  • Embed

- Normalization Layers

  • Local Response Normalization(LRN)

  • Mean Variance Normalization(MVN)

  • Batch Normalization

- Activation Layers

  • ReLU and Leaky-ReLU

  • PReLU

  • ELU

  • Sigmoid

  • TanH

  • Absolute Value

  • Power- f(x)=(shift+scale*x)^power

  • Exp- f(x)=base^(shift+scale*x)

  • Log- f(x)=log(x)

  • BNLL- f(x)=log(1+exp(x))

  • Threshold

  • Bias

  • Scale
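
The element-wise formulas listed above for Power, Exp, Log and BNLL can be written out in plain Python. This is a sketch on scalars. The function names are hypothetical, and the default parameter values (identity scale and shift, natural base) are assumptions made for illustration.

```python
import math


def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Power: f(x) = (shift + scale * x) ^ power"""
    return (shift + scale * x) ** power


def exp_layer(x, base=math.e, scale=1.0, shift=0.0):
    """Exp: f(x) = base ^ (shift + scale * x)"""
    return base ** (shift + scale * x)


def log_layer(x):
    """Log: f(x) = log(x), natural logarithm"""
    return math.log(x)


def bnll_layer(x):
    """BNLL (binomial normal log likelihood): f(x) = log(1 + exp(x))"""
    return math.log(1.0 + math.exp(x))


print(power_layer(3.0, power=2.0, scale=2.0, shift=1.0))
print(exp_layer(3.0, base=2.0))
print(log_layer(1.0))
print(bnll_layer(0.0))
```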

- Utility Layers

  • Flatten

  • Reshape

  • Batch Reindex

  • Split

  • Concat

     layer {
         bottom: "A"
         bottom: "B"
         top: "C"
         name: "concat_AB_C"
         type: "Concat"
         concat_param {
             axis: 1
         }
     }

    Suppose A and B have sizes (n1, c1, h, w) and (n2, c2, h, w). If axis=0, C has size (n1+n2, c1, h, w) and c1=c2 is required; if axis=1, C has size (n1, c1+c2, h, w) and n1=n2 is required.
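
The Concat shape rule can be checked with a short sketch. The helper name `concat_shape` is hypothetical, and it only models shapes, not the data copy.

```python
def concat_shape(a_shape, b_shape, axis=1):
    """Shape of C when A and B are concatenated along `axis`."""
    if len(a_shape) != len(b_shape):
        raise ValueError("A and B must have the same number of dimensions")
    for i, (a, b) in enumerate(zip(a_shape, b_shape)):
        # Every dimension except the concatenation axis must match.
        if i != axis and a != b:
            raise ValueError("dimension %d differs: %d vs %d" % (i, a, b))
    out = list(a_shape)
    out[axis] += b_shape[axis]
    return tuple(out)


# Hypothetical shapes for illustration.
along_channels = concat_shape((2, 3, 4, 5), (2, 7, 4, 5), axis=1)
along_batch = concat_shape((2, 3, 4, 5), (6, 3, 4, 5), axis=0)
print(along_channels)
print(along_batch)
```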

  • Slicing

  • Eltwise

    Useful for residual learning, where it implements f(x) + x.

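     layer {
         bottom: "conv10"
         bottom: "conv11"
         top: "Res"
         name: "Res"
         type: "Eltwise"
         eltwise_param {
             op: SUM
             # one coeff per bottom: 1 and -1 compute conv10 - conv11;
             # use 1 and 1 (or omit coeff) for f(x) + x
             coeff: 1
             coeff: -1
         }
     }
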
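
What `op: SUM` does with its `coeff` values can be sketched in plain Python: each bottom is scaled by its coefficient and the results are added element by element. The function name and the sample values are hypothetical.

```python
def eltwise_sum(blobs, coeffs=None):
    """Element-wise weighted sum of equally sized blobs (flat lists here)."""
    if coeffs is None:
        coeffs = [1.0] * len(blobs)  # plain sum when no coefficients are given
    if len(coeffs) != len(blobs):
        raise ValueError("need exactly one coefficient per blob")
    return [sum(c * v for c, v in zip(coeffs, values))
            for values in zip(*blobs)]


fx = [0.5, -1.0, 2.0]  # hypothetical output of the residual branch f(x)
x = [1.0, 2.0, 3.0]    # hypothetical identity branch

residual = eltwise_sum([fx, x])                   # f(x) + x
difference = eltwise_sum([fx, x], [1.0, -1.0])    # f(x) - x
print(residual)
print(difference)
```

A residual connection needs both coefficients to be 1, while coefficients of 1 and -1 give a difference.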
  • Filter/Mask

  • Parameter

  • Reduction

  • Silence

  • ArgMax

  • Softmax

  • Python-allows custom Python layers

- Loss Layers

  • Multinomial Logistic Loss

  • Infogain Loss

  • Softmax with loss

     layer {
         name: "loss"
         type: "SoftmaxWithLoss"
         bottom: "pred"
         bottom: "label"
         top: "loss"
     }

  • Euclidean

  • Hinge/Margin

  • Sigmoid Cross-Entropy Loss

     layer {
         name: "loss"
         type: "SigmoidCrossEntropyLoss"
         bottom: "pred"
         bottom: "label"
         top: "loss"
     }

  • Accuracy/Top-k layer

  • Contrastive Loss

Multiple loss layers

In fact, a network can contain many loss functions, as long as it is a DAG (directed acyclic graph). A Caffe net can itself be a DAG of any structure, not necessarily a linear chain. For example:

    layer {
        name: "recon-loss"
        type: "EuclideanLoss"
        bottom: "reconstructions"
        bottom: "data"
        top: "recon-loss"
    }
    layer {
        name: "class-loss"
        type: "SoftmaxWithLoss"
        bottom: "class-preds"
        bottom: "class-labels"
        top: "class-loss"
        loss_weight: 100.0
    }

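The loss function this defines is the weighted sum of the two losses: total loss = 1.0 × recon-loss + 100.0 × class-loss (a loss layer's loss_weight defaults to 1).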

Any layer can produce a loss.
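
How the two losses combine can be shown with a small sketch. The loss values are made up for illustration, the function name is hypothetical, and the weight of 1.0 for recon-loss assumes the default loss_weight of a loss layer.

```python
def total_loss(losses, weights):
    """Weighted sum of per-layer losses, keyed by top blob name."""
    return sum(weights[name] * value for name, value in losses.items())


# Hypothetical loss values from one forward pass.
losses = {"recon-loss": 0.25, "class-loss": 0.01}
# loss_weight per top: 1.0 assumed by default, 100.0 as set in the example.
weights = {"recon-loss": 1.0, "class-loss": 100.0}

objective = total_loss(losses, weights)
print(objective)
```

A large loss_weight makes the classification term dominate the objective even when its raw value is small.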

- Net

    name: "LogReg"
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      data_param {
        source: "input_leveldb"
        batch_size: 64
      }
    }
    layer {
      name: "ip"
      type: "InnerProduct"
      bottom: "data"
      top: "ip"
      inner_product_param {
        num_output: 2
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip"
      bottom: "label"
      top: "loss"
    }

- Net visualization


~/caffe/python/ your_net.prototxt yoursave.png

- Solver

solver_type: SGD

solver_type: NESTEROV

solver_type: ADAGRAD


- Weight sharing


Caffe usage FAQ

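  • Q1   If training is interrupted manually (Ctrl+C), can it be resumed from the point where it stopped?

    Answer:  Yes. Pass the saved solver state to the train command: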

     caffe train -solver solver.prototxt -snapshot train_1000.solverstate


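  • Q2   How can convolutional layers be visualized?

    Through Caffe's Python and MATLAB interfaces.

- Use the Python interface to visualize convolutional layers and run related tests (jupyter). Some Python packages, such as numpy and scikit-learn, must be installed first so that `import caffe` works.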

        sudo apt-get update
        sudo apt-get install python-numpy python-matplotlib python-sklearn python-scipy python-skimage python-h5py python-protobuf python-leveldb python-networkx python-nose python-pandas python-gflags Cython ipython

Following the caffe examples, we convert the .ipynb file to a .py file and run it with ipython; running it with plain python produces an error.

        jupyter nbconvert --to script '00-classification.ipynb'

However, running it with ipython still raised an ImportError.


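  • Q3   Which layers can use the same name for their bottom and top?

    Answer:  Layers that support in-place computation. The ReLU layer is the usual example: because it is element-wise, its bottom and top can share a name and the operation runs in place, which saves memory.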


Using the Caffe MATLAB interface


    clear; close all;

    % settings
    model = '*.prototxt'
    weights = '*.caffemodel'

    % load model using mat_caffe
    net = caffe.Net(model, weights, 'test');    

In the MATLAB workspace you can see that net has the following properties:


Using the Caffe Python interface


    name: "myconvnet"
    input: "data"
    input_dim: 1
    input_dim: 1
    input_dim: 256
    input_dim: 256

    layer {
      name: "conv"
      type: "Convolution"
      bottom: "data"
      top: "conv"
      convolution_param {
        num_output: 10
        kernel_size: 3
        stride: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }

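To make sure the layers of a self-defined network are connected correctly, we can visualize the network and inspect it. This requires a few extra packages:

    $ pip install pydot
    $ sudo apt-get install graphviz libgraphviz-dev
    $ pip install pygraphviz

The custom network can then be drawn with the Python script that ships with Caffe:

    $ python /path/to/caffe/python/ myconvnet.prototxt myconvnet.png

Open myconvnet.png to see the rendered network.

Next, here is how to call a trained network from Python for testing. First, create a net object to hold our convolutional network: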

    import sys
    sys.path.insert(0, '/path/to/caffe/python')
    import numpy as np
    import cv2
    from pylab import *  # plotting

    import caffe
    caffe.set_device(1)     # select which GPU to use
    caffe.set_mode_gpu()    # run computations on the GPU

    model_def = 'deploy.prototxt'       # network definition
    model_weight = 'net.caffemodel'     # trained weights
    # With phase = TEST the network only runs the forward pass, no backpropagation
    net = caffe.Net(model_def, model_weight, caffe.TEST)
