slice: along a given axis, split a blob into several pieces at the given indices. For example, splitting the channel axis of a blob with 50 channels at slice points 10, 20, 30, 40 gives 5 pieces of 10 channels each, i.e. 5 output blobs (tops).
concat: concatenate the input blobs along a given axis; the inverse of slice.
split: copy a blob several times and hand the copies to different layers, so those downstream layers share the same blob.
tile: enlarge one axis of a blob by a factor of n; for example, 1 2 3 4 doubled becomes 1 1 2 2 3 3 4 4 (see the prototxt sketch after this list for the layers that are not exemplified later).
reduction: collapse a given axis down to a single value; the reduction can be sum, mean, asum, or sumsq.
reshape: straightforward; it works like reshape in MATLAB.
eltwise: merge several equally sized blobs into one; the merge can be element-wise addition, multiplication, or maximum.
flatten: collapse several middle axes into one; it can also be expressed with reshape.
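For the layers above that do not appear in the examples later in this post (tile, reduction, flatten, split), here is a minimal prototxt sketch; the parameter names (tile_param, reduction_param, flatten_param) follow standard Caffe, and the blob names are placeholders:
layer {
name: "tile"
type: "Tile"
bottom: "in"
top: "tiled"
tile_param {
axis: 1   # axis to tile along
tiles: 2  # repeat the content 2x along this axis
}
}
layer {
name: "reduce"
type: "Reduction"
bottom: "in"
top: "reduced"
reduction_param {
operation: MEAN  # SUM, ASUM, SUMSQ or MEAN
axis: 1          # everything from this axis onward is reduced
}
}
layer {
name: "flat"
type: "Flatten"
bottom: "in"
top: "flat"
flatten_param {
axis: 1  # collapse all axes from 1 onward into one
}
}
# A Split layer has no parameters; Caffe normally inserts it automatically
# when one top is consumed by several layers.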
1. Data layers
#lmdb data
layer {
name: "left_eye"
type: "Data"
top: "data_left_eye"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "left_eye_regression/lmdb/train_data_npd"
batch_size: 64
backend: LMDB
}
}
Example transform_param 1:
transform_param {
scale: 0.00390625
}
Example transform_param 2 (with per-channel mean subtraction):
transform_param {
mean_value: 104
mean_value: 117
mean_value: 124
scale: 0.0078125
}
layer
{
name: "eltwise_layer"
type: "Eltwise"
bottom: "A"
bottom: "B"
top: "diff"
eltwise_param {
operation: SUM
}
}
The Eltwise layer supports three operations: PROD (element-wise product), SUM (element-wise addition, which also covers subtraction), and MAX (element-wise maximum); SUM is the default.
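The example above names its top "diff", which suggests a subtraction; with SUM this is expressed through per-bottom coefficients. A minimal sketch, assuming the standard coeff field of eltwise_param (one coefficient per bottom, in order):
layer {
name: "diff"
type: "Eltwise"
bottom: "A"
bottom: "B"
top: "diff"
eltwise_param {
operation: SUM
coeff: 1    # coefficient for A
coeff: -1   # coefficient for B, so the layer computes A - B
}
}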
layer {
name: "left_eye"
type: "Data"
top: "label_left_eye"
include {
phase: TRAIN
}
data_param {
source: "left_eye_regression/lmdb/train_label_npd"
batch_size: 64
backend: LMDB
}
}
layer {
name: "left_eye"
type: "Data"
top: "data_left_eye"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "left_eye_regression/lmdb/test_data_npd"
batch_size: 64
backend: LMDB
}
}
layer {
name: "left_eye"
type: "Data"
top: "label_left_eye"
include {
phase: TEST
}
data_param {
source: "left_eye_regression/lmdb/test_label_npd"
batch_size: 64
backend: LMDB
}
}
#HDF5 data
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "examples/hdf5_classification/data/train.txt"
batch_size: 10
}
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TEST
}
hdf5_data_param {
source: "examples/hdf5_classification/data/test.txt"
batch_size: 10
}
}
2. Other layers
#Convolution layer
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
# learning rate and decay multipliers for the filter weights
lr_mult: 1
decay_mult: 1
}
param {
# learning rate and decay multipliers for the biases
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 3
stride: 1
# pad: 2  # optional zero-padding
weight_filler {
type: "xavier"
(type: "gaussian"
std: 0.0001)
}
bias_filler {
type: "constant"
}
}
}
#Pooling layer
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX  # or AVE
kernel_size: 2
stride: 2
}
}
#Activation layers
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "prelu"
type: "PReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "tanh4"
type: "TanH"
bottom: "ip1"
top: "ip1"
}
#Fully connected (InnerProduct) layer
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 256
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
#Local response normalization (LRN) layer
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 3
alpha: 5e-05
beta: 0.75
norm_region: WITHIN_CHANNEL
}
}
#Dropout layer
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7-conv"
top: "fc7-conv"
dropout_param {
dropout_ratio: 0.5
}
}
#Sigmoid layer
layer {
name: "Sigmoid1"
type: "Sigmoid"
bottom: "pool1"
top: "Sigmoid1"
}
# Softmax layer
layer {
name: "prob"
type: "Softmax"
bottom: "ip1"
top: "prob"
}
#Softmax loss
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip1"
bottom: "label"
top: "loss"
}
# Accuracy layer
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip1"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
3. Special layers
#Slice layer. axis: the axis to split along. slice_point: where to cut (number of slice_point entries = number of tops - 1).
If no slice_point is given, the bottom is split evenly among the tops (here label1 and label2).
slice_point: 10 means the first 10 entries along that axis ([0, 10)) go to label1 and the rest go to label2.
layer {
name: "slice"
type: "Slice"
bottom: "label"
top: "label1"
top: "label2"
slice_param {
axis: 1
slice_point: 10
}
}
# label1 gets [0, 10), label2 gets [10, 20), label3 gets the rest.
layer {
name: "slice"
type: "Slice"
bottom: "label"
top: "label1"
top: "label2"
top: "label3"
slice_param {
axis: 1
slice_point: 10
slice_point: 20
}
}
# Concat layer
layer {
name: "data_all"
type: "Concat"
bottom: "data_left_eye"
bottom: "data_right_eye"
bottom: "data_nose"
bottom: "data_mouth"
top: "data_all"
concat_param {
axis: 1
}
}
layer {
name: "label_all"
type: "Concat"
bottom: "label_left_eye"
bottom: "label_right_eye"
bottom: "label_nose"
bottom: "label_mouth"
top: "label_all"
concat_param {
axis: 1
}
}
#Reshape layer
layer {
name: "reshape"
type: "Reshape"
bottom: "input"
top: "output"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 2
dim: 3
dim: -1 # infer it from the other dimensions
}
}
}
Image_data_layer: this layer can also be used for training. It does not need an LMDB file; you give it an image root folder plus a list file (all-sample.txt here), where each line contains an image path and its label (e.g. "images/0001.jpg 3"), i.e. the same label format you would use when generating the corresponding LMDB.
Image augmentation options of caffe's ImageData layer
mirror
mirror: true enables random left-right flipping, a common operation when training models.
contrast_brightness_adjustment
Enables or disables contrast/brightness adjustment; disabled (false) by default.
contrast_brightness_adjustment: true
min_side_min
min_side_min and min_side_max are added for random cropping while keeping the aspect ratio, as mentioned in "Deep Residual Learning for Image Recognition" (http://arxiv.org/abs/1512.03385).
min_side_min: 224
When min_side_min and min_side_max are used there is no need to set new_height and new_width in image_data_param; images are randomly resized into this range instead.
min_side_max
min_side_max: 256
crop_size
crop_size: 224
In Caffe, if crop_size is defined, images larger than crop_size are randomly cropped during training, while at test time only the center crop is taken.
max_rotation_angle
Maximum rotation angle of the image, default 0.
max_rotation_angle: 15
min_contrast
Minimum contrast multiplier (min alpha), default 0.8.
max_contrast
Maximum contrast multiplier (max alpha), default 1.2.
max_smooth
Maximum smoothing multiplier for Gaussian smoothing, default 6.
apply_probability
Probability with which each augmentation is applied, default 0.5.
max_color_shift
Maximum color shift along the RGB axes.
max_color_shift: 20
mean_value
Mean values, in BGR order.
debug_params
Enables or disables printing of the augmentation parameters, default disabled.
debug_params: false
min_side
Resize & crop while keeping the aspect ratio; default 0 (disabled).
max_brightness_shift
Maximum brightness shift in the positive and negative directions (beta), default 5.
smooth_filtering
Enables or disables smooth filtering, default false.
layer {
name: "in_shop"
type: "ImageData"
top: "data"
top: "label"
include{
phase: TRAIN
}
transform_param {
mirror: true
contrast_brightness_adjustment: true
min_side_min: 224
min_side_max: 256
crop_size: 224
max_rotation_angle: 15
min_contrast: 0.8
max_contrast: 1.2
max_smooth: 6
apply_probability: 0.5
max_color_shift: 20
mean_value: 104
mean_value: 117
mean_value: 123
debug_params: false
}
image_data_param {
source: "/export/home/dyh/workspace/circle_k/for_douyuhao/all-sample.txt"
batch_size: 128
new_height: 256
new_width: 256
shuffle: true
root_folder: "/export/home/dyh/workspace/circle_k/for_douyuhao/all-images/"
}
}
Caffe's BatchNorm (BN) layer has three internal parameter blobs: the mean, the variance, and the moving-average factor. The BN layer is defined as follows:
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: false  # differs between the training and test phases
}
include: { phase: TRAIN }
}
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
include: { phase: TEST }
}
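Because those three statistics are stored as the layer's param blobs and are updated by the BN layer itself during the forward pass rather than by the solver, many publicly shared ResNet-style training prototxts also freeze them explicitly. A hedged sketch, reusing the blob names from the snippet above:
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
param { lr_mult: 0 decay_mult: 0 }  # mean
param { lr_mult: 0 decay_mult: 0 }  # variance
param { lr_mult: 0 decay_mult: 0 }  # moving-average factor
batch_norm_param {
use_global_stats: true
}
}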
BatchNorm"
batch_norm_param {
use_global_stats: false //训练阶段和测试阶段不同,
}
include: { phase: TRAIN }
}
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
include: { phase: TEST }
}
If use_global_stats is not set explicitly, it effectively defaults to true in the test phase and to false in the training phase.
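Relying on that default, a single BN layer with no batch_norm_param and no include restriction can in principle serve both phases; a minimal sketch, assuming the default phase-dependent behavior just described:
layer {
bottom: "res2a_branch2b"
top: "res2a_branch2b"
name: "bn2a_branch2b"
type: "BatchNorm"
# use_global_stats left unset: false during TRAIN, true during TEST
}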
Two things to watch out for when using Batch Normalization in Caffe:
1. It must be paired with a Scale layer; see http://blog.csdn.net/sunbaigui/article/details/50807398 and the Residual Network prototxts.
2. Set use_global_stats to false for training and to true for testing; otherwise training will produce "NAN" losses or the model will not converge.
#batch normalization (with the companion Scale layer)
layer {
bottom: "conv1"
top: "conv1"
name: "bn_conv1"
type: "BatchNorm"
batch_norm_param {
use_global_stats: true
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "scale_conv1"
type: "Scale"
scale_param {
bias_term: true
}
}
4. Loss layers. Losses commonly used in Caffe:
MULTINOMIAL_LOGISTIC_LOSS: multinomial logistic loss
SIGMOID_CROSS_ENTROPY_LOSS: sigmoid cross-entropy loss
SOFTMAX_LOSS: softmax loss
EUCLIDEAN_LOSS: squared (Euclidean) loss
HINGE_LOSS: hinge loss (SVM-style)
INFOGAIN_LOSS: information gain loss
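These are the old-style upper-case type names; in current prototxt the types are written in CamelCase (SoftmaxWithLoss, EuclideanLoss, SigmoidCrossEntropyLoss, HingeLoss, InfogainLoss) and are wired up the same way as the SoftmaxWithLoss example above: prediction bottom, label bottom, loss top. As one more illustration, a minimal sketch of the sigmoid cross-entropy loss with placeholder blob names:
layer {
name: "loss"
type: "SigmoidCrossEntropyLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}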
#Euclidean (squared error) loss
layer { name: "loss" type: "EuclideanLoss" bottom: "ip2" bottom: "label" top: "loss"}
#Multiple losses can be weighted with loss_weight
layer { name: "loss1" type: "EuclideanLoss" bottom: "out1" bottom: "label1" top: "loss1" loss_weight:0.4}
layer { name: "loss2" type: "EuclideanLoss" bottom: "out2" bottom: "label2" top: "loss2" loss_weight:0.6}
5. Data layer for testing (MemoryData)
layer{ name: "data" type: "MemoryData" top: "data_all" top: "label" memory_data_param { batch_size: 1 channels: 12 height: 36 width: 48 } transform_param { scale: 0.00390625 }}
Getting the input data by merging several image patches along the channel dimension:
Mat mats[4];
vector<Rect> rect4;
for (int k = 0; k < 4; k++)
{
    float x, y;
    if (k == 3)
    {
        x = (som.points[k].x + som.points[k + 1].x) / 2.0 * Scale;
        y = (som.points[k].y + som.points[k + 1].y) / 2.0 * Scale;
    } else {
        x = som.points[k].x * Scale;
        y = som.points[k].y * Scale;
    }
    x = x - 24;
    if (k == 2) { y = y - 12; }
    else if (k == 3) { y = y - 20; }
    else { y = y - 24; }
    checkxy(x, y);
    Rect rect(x, y, 48, 36);
    rect4.push_back(rect);
    Mat roi = img(rect);
    //imshow("test", roi); //waitKey();
    mats[k] = roi;
}
Mat all_data(36, 48, CV_8UC(12)); // merge the mats array; 4 is the number of patches to merge
merge(mats, 4, all_data);
Contrastive loss, mostly used in a Siamese network:
layer { name: "loss"
type: "ContrastiveLoss"
bottom: "feat"
bottom: "feat_p"
bottom: "label"
top: "loss"
contrastive_loss_param { margin: 1 }
}
During the forward pass the layer simply copies its input; during the backward pass it scales the gradient by a specified factor.
This layer is a useful helper for certain network structures. For example, when the network has a branch whose gradient should not influence the updates of earlier layers, we can scale that branch's gradient by 0.
Different kinds of layers include different headers; for details, look for a similar existing file under the "caffe/include/caffe/layers" directory. The hpp file we write here also goes into that directory.
Note: the parts wrapped in comment markers below are the ones that need attention.
In particular: names must match exactly, including capitalization; getting this wrong is the main reason adding a new layer fails for many people.
//*****************************************
#ifndef CAFFE_DIFFCUTOFF_LAYER_HPP_
#define CAFFE_DIFFCUTOFF_LAYER_HPP_
//*****************************************
#include <vector>
#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
//*****************************************
#include "caffe/layers/neuron_layer.hpp"
//*****************************************
namespace caffe {

template <typename Dtype>
//****** the type of our layer will be: "DiffCutoff" *******
class DiffCutoffLayer : public NeuronLayer<Dtype> {
//*****************************************
 public:
  explicit DiffCutoffLayer(const LayerParameter& param) : NeuronLayer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
  //**** we only need one bottom and one top *****
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  //****** the type of our layer will be: "DiffCutoff" *******
  virtual inline const char* type() const { return "DiffCutoff"; }

 protected:
  //****** only the CPU version is implemented here, so the original GPU declarations are removed *******
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top, const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  // ***** a Dtype scalar that stores the gradient scaling factor *****
  Dtype diff_scale;
};

}  // namespace caffe
#endif
The cpp file should be placed under src/caffe/layers.
#include <algorithm>
#include <vector>
//*****************************************
#include "caffe/layers/diff_cutoff_layer.hpp"
//*****************************************
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void DiffCutoffLayer<Dtype>::LayerSetUp(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  NeuronLayer<Dtype>::LayerSetUp(bottom, top);
  // The forward pass does not modify the data, so top has the same shape as bottom
  top[0]->Reshape(bottom[0]->shape());
}

template <typename Dtype>
void DiffCutoffLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // The forward pass simply copies the bottom data to the top
  const int count = top[0]->count();
  caffe_copy(
      count,
      bottom[0]->cpu_data(),
      top[0]->mutable_cpu_data());
}

template <typename Dtype>
void DiffCutoffLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top, const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  const int count = top[0]->count();
  const Dtype* top_diff = top[0]->cpu_diff();
  // Read the user-specified gradient scaling factor; note that our parameter name is diff_scale
  diff_scale = this->layer_param_.diffcutoff_param().diff_scale();
  // If the gradient should propagate to this bottom, scale top's diff and write it into bottom's diff
  if (propagate_down[0]) {
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    caffe_cpu_axpby(
        count,
        diff_scale,
        top_diff,
        Dtype(0),
        bottom_diff);
  }
}

#ifdef CPU_ONLY
STUB_GPU(DiffCutoffLayer);
#endif

INSTANTIATE_CLASS(DiffCutoffLayer);
REGISTER_LAYER_CLASS(DiffCutoff);

}  // namespace caffe
Next we add the parameter and message definitions for our new layer to caffe.proto.
[1] Since our layer has a diff_scale parameter, we first add the new field inside message LayerParameter {}. When adding it we must pick a unique ID; the next available value can be read from this comment:
// NOTE
// Update the next available ID when you add a new LayerParameter field.
//
// LayerParameter next available layer-specific ID: 143 (last added: BatchCLuster)
message LayerParameter {
From the comment above we can see that the next available ID is 143.
So we add this line:
optional DiffCutoffParameter diffcutoff_param = 143;
[2] Add the message definition anywhere in caffe.proto:
message DiffCutoffParameter {
optional float diff_scale = 1 [default = 1]; // by default the gradient is not scaled
}
[3] Add the following inside message V1LayerParameter {}:
Inside enum LayerType {}, add a unique ID; any value not already used there will do.
DIFF_CUTOFF=45;
Then, outside the enum but still inside V1LayerParameter, add the field; again, the field number only needs to be unused:
optional DiffCutoffParameter diffcutoff_param = 46;
[4] Add the parameter definition inside message V0LayerParameter {}:
optional float diff_scale = 47 [default = 1];
Finally, for the Windows build, modify the two files under ../windows/libcaffe, libcaffe.vcxproj and libcaffe.vcxproj.filters, adding entries for the new layer's hpp and cpp files to each.
An example of how to use the new layer:
layer {
name: "diff_1"
type: "DiffCutoff"
bottom: "conv1"
top: "diff_1"
diffcutoff_param {
diff_scale: 0.0001
}
}
(1) Pay close attention to capitalization; it cannot be stressed enough.
(2) If you are unsure how to write something, find an existing Caffe layer to use as a reference.
(3) Caffe's data-manipulation functions are defined in src/caffe/util/math_functions.cpp.
You can also refer to this blog post:
http://blog.csdn.net/seven_first/article/details/47378697