Learning Caffe (3): Writing Your Own Convolutional Neural Network


  • (1) Preface
  • (2) Writing the Network Proto File
  • (3) Solver Configuration
  • (4) Start Training

(1) Preface

In the last post we ran Caffe's built-in MNIST example, and I have to say, training with Caffe turned out to be pretty easy. In this post we'll start writing a neural network of our own. For now we'll keep using the MNIST data we downloaded last time and build our own dataset later on. OK, let's get started!

(2) Writing the Network Proto File

This file describes the structure of your network as a series of layer blocks. Take a convolution layer as an example:

layer {
  name: "conv1"         # name of this layer
  type: "Convolution"   # layer type, a convolution here
  bottom: "data"        # where the input data comes from
  top: "conv1"          # the output is stored in the blob named conv1
  param {               # per-blob settings common to all layers, e.g. learning-rate multipliers
    lr_mult: 1          # learning-rate multiplier for the weights
  }
  param {
    lr_mult: 2          # learning-rate multiplier for the bias
  }
  convolution_param {   # convolution-specific options: kernel size, stride, etc.; more on these later
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
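
A quick sanity check on these numbers (my own back-of-the-envelope note, not from the original post): with no padding, the output spatial size of a convolution is

output = (input - kernel_size) / stride + 1

so for a 28x28 MNIST image with kernel_size: 5 and stride: 1 we get (28 - 5) / 1 + 1 = 24, i.e. conv1 produces num_output: 20 feature maps of size 24x24.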

Let's try to define a network with two convolution layers, add BatchNorm layers, and save it as network.prototxt:

name: "my_net"
# load the training data
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
# define the network structure
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bn_1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn_1"
  param {
    lr_mult: 0
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "bn_1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bn_2"
  type: "BatchNorm"
  bottom: "pool2"
  top: "bn_2"
  param {
    lr_mult: 0
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "bn_2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
# define the accuracy and the loss function
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
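
Two notes on the BatchNorm layers above (my additions, not part of the original walkthrough). First, Caffe's BatchNorm layer keeps three statistics blobs (mean, variance, moving-average factor) that must not be learned; a single param { lr_mult: 0 } is enough because Caffe fills in the remaining param specs with lr_mult: 0 on its own. Second, BatchNorm only normalizes and learns no scale or shift, so in practice it is usually followed by a Scale layer that supplies the learnable gamma/beta. A minimal sketch for bn_1 (the name scale_1 is my own choice):

layer {
  name: "scale_1"
  type: "Scale"
  bottom: "bn_1"
  top: "bn_1"        # applied in place
  scale_param {
    bias_term: true  # learn both the scale (gamma) and the shift (beta)
  }
}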

(3) Solver Configuration

Next we define solver.prototxt, which holds the training hyperparameters:

net: "examples/exercise/network.prototxt"
test_iter: 100
# 每训练500次进行一次测试
test_interval: 500
# 基础学习率、上一次更新的权重、权重衰减
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# 学习率改变策略
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# 每多少次显示训练变化数据
display: 100
# 最大迭代次数
max_iter: 10000
# 每迭代多少次创建训练快照
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# 用CPU还是GPU训练
solver_mode: CPU
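
For reference (my own note, based on the formula Caffe uses for the "inv" policy): the learning rate at iteration t is

lr(t) = base_lr * (1 + gamma * t) ^ (-power)

Plugging in t = 980 gives 0.01 * (1 + 0.0001 * 980)^(-0.75) ≈ 0.00932, which matches the lr = 0.00932284 printed in the training log below.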

(4) Start Training

From the caffe-master directory, run:

build/tools/caffe train --solver=examples/exercise/solver.prototxt
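
Two handy variants of this command (standard flags of the caffe binary, though the original post does not use them; the snapshot file names below are what the snapshot settings above would produce, assuming training actually ran that far):

# resume training from a saved solver state
build/tools/caffe train --solver=examples/exercise/solver.prototxt --snapshot=examples/mnist/lenet_iter_5000.solverstate

# evaluate a trained model on the TEST-phase data for 100 batches
build/tools/caffe test --model=examples/exercise/network.prototxt --weights=examples/mnist/lenet_iter_10000.caffemodel --iterations=100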

Output:

I0703 15:15:15.058523 47871 solver.cpp:239] Iteration 980 (12.5313 iter/s, 0.798s/10 iters), loss = 0.0946411
I0703 15:15:15.058604 47871 solver.cpp:258]     Train net output #0: loss = 0.0946413 (* 1 = 0.0946413 loss)
I0703 15:15:15.058629 47871 sgd_solver.cpp:112] Iteration 980, lr = 0.00932284
I0703 15:15:15.866871 47871 solver.cpp:239] Iteration 990 (12.3762 iter/s, 0.808s/10 iters), loss = 0.0414168
I0703 15:15:15.866932 47871 solver.cpp:258]     Train net output #0: loss = 0.0414169 (* 1 = 0.0414169 loss)
I0703 15:15:15.866943 47871 sgd_solver.cpp:112] Iteration 990, lr = 0.00931648
I0703 15:15:16.583163 47871 solver.cpp:464] Snapshotting to binary proto file examples/mnist/lenet_iter_1000.caffemodel
I0703 15:15:16.591723 47871 sgd_solver.cpp:284] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_1000.solverstate
I0703 15:15:16.630581 47871 solver.cpp:327] Iteration 1000, loss = 0.0563128
I0703 15:15:16.630635 47871 solver.cpp:347] Iteration 1000, Testing net (#0)
I0703 15:15:17.115231 47871 solver.cpp:414]     Test net output #0: accuracy = 0.978
I0703 15:15:17.115291 47871 solver.cpp:414]     Test net output #1: loss = 0.0601787 (* 1 = 0.0601787 loss)
I0703 15:15:17.115298 47871 solver.cpp:332] Optimization Done.
I0703 15:15:17.115303 47871 caffe.cpp:250] Optimization Done.

I'm running the CPU build here, and the log still ends at 97.8% test accuracy. I have to say, Caffe trains pretty fast; it's written in C++ after all. C++ rocks!!! I have a feeling TensorFlow is about to lose this fan......
