In the previous section we ran the MNIST example that ships with Caffe, and training with Caffe turned out to be pretty painless.
In this section we start writing a network of our own. For data we will reuse the MNIST dataset downloaded last time; building our own dataset can wait until later. OK, let's get started!
The structure of your network is written as a .prototxt file made up of a series of layer blocks. Take a convolution layer as an example:
layer {
  name: "conv1"          # name of this layer
  type: "Convolution"    # layer type; a convolution here
  bottom: "data"         # blob the layer takes as input
  top: "conv1"           # blob the layer writes its output to
  param {                # settings shared by all layer types, such as learning-rate multipliers
    lr_mult: 1           # learning-rate multiplier for the weights
  }
  param {
    lr_mult: 2           # the biases conventionally learn at twice the base rate
  }
  convolution_param {    # convolution-specific options: kernel size, stride, etc. (more detail later)
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
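A quick sanity check on those numbers: with no padding, a Caffe convolution produces an output of spatial size (input + 2 × pad − kernel_size) / stride + 1. A minimal sketch in plain Python (nothing Caffe-specific is assumed here):

def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    # Convolution output size in Caffe (floor division; pooling rounds up instead,
    # which makes no difference for the exact divisions used in this post).
    return (in_size + 2 * pad - kernel_size) // stride + 1

print(conv_output_size(28, 5))             # conv1 on a 28x28 MNIST image -> 24
print(conv_output_size(24, 2, stride=2))   # a 2x2/stride-2 max-pool after it -> 12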
Let's try defining a convolutional network with two convolution layers plus BatchNorm layers, saved as network.prototxt:
name: "my_net"
# 加载训练数据
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
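If you want to confirm what those LMDBs actually contain, you can peek at a record directly. A minimal sketch, assuming the py-lmdb package and pycaffe's compiled proto bindings are importable, run from the caffe-master directory:

import lmdb
from caffe.proto import caffe_pb2

env = lmdb.open('examples/mnist/mnist_train_lmdb', readonly=True)
with env.begin() as txn:
    key, value = next(txn.cursor().iternext())  # first record in the database
    datum = caffe_pb2.Datum()
    datum.ParseFromString(value)                # each record is a serialized Datum proto
    print(datum.channels, datum.height, datum.width, datum.label)  # 1 28 28 and its label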
# define the network structure
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bn_1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn_1"
  param {
    lr_mult: 0  # the BN statistics are moving averages, not learned by the solver
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "bn_1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "bn_2"
  type: "BatchNorm"
  bottom: "pool2"
  top: "bn_2"
  param {
    lr_mult: 0
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "bn_2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
# accuracy (computed only in the TEST phase) and the loss function
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
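Before training, it is worth checking that the blobs come out with the shapes we expect. A minimal sketch using pycaffe (assuming it is built, and that you run from the caffe-master directory, since loading the net in the TEST phase opens the test LMDB):

import caffe

caffe.set_mode_cpu()
net = caffe.Net('examples/exercise/network.prototxt', caffe.TEST)
for name, blob in net.blobs.items():
    print(name, blob.data.shape)
# Expected, with the test batch size of 100: data (100, 1, 28, 28),
# conv1 (100, 20, 24, 24), pool1/bn_1 (100, 20, 12, 12),
# conv2 (100, 50, 8, 8), pool2/bn_2 (100, 50, 4, 4), ip1 (100, 500), ip2 (100, 10)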
Next we define solver.prototxt, which holds the training hyperparameters:
net: "examples/exercise/network.prototxt"
test_iter: 100
# 每训练500次进行一次测试
test_interval: 500
# 基础学习率、上一次更新的权重、权重衰减
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# 学习率改变策略
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# 每多少次显示训练变化数据
display: 100
# 最大迭代次数
max_iter: 10000
# 每迭代多少次创建训练快照
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# 用CPU还是GPU训练
solver_mode: CPU
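With lr_policy: "inv", Caffe computes the learning rate at iteration t as base_lr * (1 + gamma * t) ^ (-power). A quick check in Python that this matches the training log further down:

# learning rate under the "inv" policy
base_lr, gamma, power = 0.01, 0.0001, 0.75
lr = base_lr * (1 + gamma * 980) ** -power
print(lr)   # ~0.00932284, the value the log reports at iteration 980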
From the caffe-master root directory, run:
build/tools/caffe train --solver=examples/exercise/solver.prototxt
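Equivalently, if pycaffe is built, the same solver file can be driven from Python; caffe.SGDSolver is the Python wrapper around the SGD solver we configured above (a minimal sketch):

import caffe

caffe.set_mode_cpu()
solver = caffe.SGDSolver('examples/exercise/solver.prototxt')
solver.solve()   # runs the full training loop, snapshots included
# or step through it manually: solver.step(100) runs 100 iterations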
Running the train command produces output like this (judging by the snapshot and "Optimization Done" lines at iteration 1000, this particular log came from a run with max_iter lowered to 1000):
I0703 15:15:15.058523 47871 solver.cpp:239] Iteration 980 (12.5313 iter/s, 0.798s/10 iters), loss = 0.0946411
I0703 15:15:15.058604 47871 solver.cpp:258] Train net output #0: loss = 0.0946413 (* 1 = 0.0946413 loss)
I0703 15:15:15.058629 47871 sgd_solver.cpp:112] Iteration 980, lr = 0.00932284
I0703 15:15:15.866871 47871 solver.cpp:239] Iteration 990 (12.3762 iter/s, 0.808s/10 iters), loss = 0.0414168
I0703 15:15:15.866932 47871 solver.cpp:258] Train net output #0: loss = 0.0414169 (* 1 = 0.0414169 loss)
I0703 15:15:15.866943 47871 sgd_solver.cpp:112] Iteration 990, lr = 0.00931648
I0703 15:15:16.583163 47871 solver.cpp:464] Snapshotting to binary proto file examples/mnist/lenet_iter_1000.caffemodel
I0703 15:15:16.591723 47871 sgd_solver.cpp:284] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_1000.solverstate
I0703 15:15:16.630581 47871 solver.cpp:327] Iteration 1000, loss = 0.0563128
I0703 15:15:16.630635 47871 solver.cpp:347] Iteration 1000, Testing net (#0)
I0703 15:15:17.115231 47871 solver.cpp:414] Test net output #0: accuracy = 0.978
I0703 15:15:17.115291 47871 solver.cpp:414] Test net output #1: loss = 0.0601787 (* 1 = 0.0601787 loss)
I0703 15:15:17.115298 47871 solver.cpp:332] Optimization Done.
I0703 15:15:17.115303 47871 caffe.cpp:250] Optimization Done.
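To sanity-check a saved snapshot, you can load it back with pycaffe and push a test batch through it (a minimal sketch; the .caffemodel filename depends on the iteration your run snapshotted at, so adjust it to match your own log):

import caffe

caffe.set_mode_cpu()
net = caffe.Net('examples/exercise/network.prototxt',
                'examples/mnist/lenet_iter_1000.caffemodel',  # adjust to your snapshot
                caffe.TEST)
out = net.forward()   # draws one batch of 100 images from the test LMDB
print(out['accuracy'], out['loss'])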
I'm using the CPU build here, and I have to say Caffe still trains pretty fast; being written in C++ certainly helps. C++ rocks!!! I get the feeling TensorFlow is about to lose this fan...