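1. First, installing Caffe is fairly troublesome; I will write a detailed tutorial later when I have time.
For now, here is the official installation guide: http://caffe.berkeleyvision.org/installation.html
2. After installation, read through the examples on the official site and run them step by step. Links: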
1).http://caffe.berkeleyvision.org/gathered/examples/mnist.html
2).http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
……
3. After that, study the following examples of using Caffe from Python to learn how Caffe is used.
1). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb
2). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
3).http://www.cnblogs.com/empty16/p/4878164.html
……
4. Below, a face recognition problem serves as the example: Caffe is trained and tested on the following face database:
http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
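It contains 40 people, with 10 face photos per person, as shown in the figure below:
The official site links to the Model Zoo, and a search there shows that a pretrained face recognition model already exists and can be used directly:
Download link: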
http://www.robots.ox.ac.uk/%7Evgg/software/vgg_face/src/vgg_face_caffe.tar.gz
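The VGG Face Descriptor website provides the model and source code. For details, follow its instructions; the basic workflow should be fairly simple:
- In the script source, specify the path to the Caffe library, the .caffemodel file, and the input data, then call the network's test (forward) function to obtain the network output.
- Run the script.
If the usage notes for the source code are not clear enough, refer to the examples in the Jupyter Notebook Viewer. The basic workflow should be the same as for the ImageNet classification task. The dataset used to train the model is described in Section 3 of the VGG Face Descriptor paper.
Next, because the face images are grayscale, they must first be converted to RGB images with OpenCV before VGG can use them. The Python code is as follows: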
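The workflow in the list above can be sketched with pycaffe roughly as follows. This is a minimal sketch, not the script shipped with VGG Face: the function names `top_k` and `classify_face` are hypothetical, the input blob is assumed to be called `data` (as in 00-classification.ipynb), and the per-channel mean `mean_bgr` is assumed to come from the model's own documentation.

```python
def top_k(scores, k=5):
    """Return the indices of the k largest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def classify_face(image_path, prototxt, caffemodel, mean_bgr, caffe_python_dir=None):
    """Run one forward pass and return the output scores as a list.

    All arguments are supplied by the caller (hypothetical helper);
    mean_bgr is the per-channel mean (B, G, R) to subtract.
    """
    import sys
    if caffe_python_dir:
        # Step 1: point Python at the Caffe library
        sys.path.insert(0, caffe_python_dir)
    import numpy as np  # imported lazily so top_k works without Caffe installed
    import caffe

    # Step 2: load the network definition and the .caffemodel weights
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)

    # Step 3: prepare the input the same way as 00-classification.ipynb
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))      # HxWxC -> CxHxW
    transformer.set_mean('data', np.array(mean_bgr))  # subtract per-channel mean
    transformer.set_raw_scale('data', 255)            # [0, 1] -> [0, 255]
    transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR
    image = caffe.io.load_image(image_path)
    net.blobs['data'].data[...] = transformer.preprocess('data', image)

    # Step 4: forward pass; read the first (assumed only) output blob
    output = net.forward()
    scores = output[net.outputs[0]][0]
    return [float(s) for s in scores.flatten()]
```

A call would look like `top_k(classify_face(path, prototxt, caffemodel, mean_bgr), 5)`, where every argument is a placeholder for your own files and values.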
import os
import cv2


def convert_gray_img_to_rgb(base_dir, dir_pre_str, dir_range_list, dir_post_str,
                            file_format, partion_list):
    # partion_list[0] holds the training image numbers, partion_list[1] the test ones.
    # The source images are expected to be already sorted into
    # <base_dir>/train/<subject>/ and <base_dir>/tst/<subject>/.
    if base_dir == "":
        base_dir_str = ""
    else:
        base_dir_str = base_dir + os.sep
    for i in dir_range_list:
        for index, partion_list_part in enumerate(partion_list):
            split = ""
            if index == 0:
                split = "train"
            elif index == 1:
                split = "tst"
            subject_dir = base_dir_str + split + os.sep + dir_pre_str + str(i) + dir_post_str
            for k in partion_list_part:
                file_input_path = subject_dir + os.sep + str(k) + file_format
                # Flag 0 loads the image as single-channel grayscale
                img = cv2.imread(file_input_path, 0)
                if img is None:
                    raise IOError("Cannot read image: " + file_input_path)
                # Replicate the gray channel into three channels
                img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
                out_file = subject_dir + os.sep + str(k) + ".jpg"
                cv2.imwrite(out_file, img)


if __name__ == '__main__':
    source_dir = "/Users/Ren/Downloads/att_faces_back"
    dir_pre_str = "s"
    dir_range_list = range(1, 41)
    test_partion_list = [7, 8, 9, 10]
    train_partion_list = [1, 2, 3, 4, 5, 6]
    dir_post_str = ""
    file_format = ".pgm"
    convert_gray_img_to_rgb(source_dir, dir_pre_str, dir_range_list, dir_post_str,
                            file_format, [train_partion_list, test_partion_list])
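For this database, the face data must first be split into training and test sets and converted to LMDB format. For the procedure, see http://www.cnblogs.com/dupuleng/articles/4370236.html. My code is below; I saved it to examples/att_faces/create_att_faces.sh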
#!/usr/bin/env sh
# Create the att_faces lmdb inputs
# N.B. set the paths to the att_faces train + test data dirs
EXAMPLE=examples/att_faces
DATA=data/att_faces
TOOLS=build/tools
DBTYPE=lmdb
TRAIN_DATA_ROOT=$DATA/train/
TEST_DATA_ROOT=$DATA/tst/
ROOT=./
# Set RESIZE=true to resize the images to 224x224. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces training data is stored."
  exit 1
fi
if [ ! -d "$TEST_DATA_ROOT" ]; then
  echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
  echo "Set the TEST_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces test data is stored."
  exit 1
fi
echo "Creating train lmdb..."
rm -rf $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/att_faces_tst_$DBTYPE
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/train.txt \
    $EXAMPLE/att_faces_train_$DBTYPE
echo "Creating tst lmdb..."
rm -f $EXAMPLE/mean.binaryproto
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/tst.txt \
    $EXAMPLE/att_faces_tst_$DBTYPE
echo "Computing image mean..."
./build/tools/compute_image_mean -backend=$DBTYPE \
    $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."
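The script above reads $DATA/train.txt and $DATA/tst.txt, which list one image path and one integer label per line. The post does not show how these files were produced, so the following is only a sketch under assumptions: the helper `write_image_list` is hypothetical, the converted .jpg files are assumed to sit in s1 ... s40 under each split directory, and subject sN is assigned label N-1 so that the labels run from 0 to 39.

```python
import os


def write_image_list(split_dir, list_path, path_prefix, num_subjects=40, ext=".jpg"):
    """Write one "<path> <label>" line per image found under split_dir.

    split_dir   -- directory holding s1 ... sN (e.g. data/att_faces/train)
    list_path   -- output list file (e.g. data/att_faces/train.txt)
    path_prefix -- prefix written in front of each "sN/<file>" entry
    Returns the number of lines written.
    """
    count = 0
    with open(list_path, "w") as out:
        for subject in range(1, num_subjects + 1):
            subject_dir = os.path.join(split_dir, "s%d" % subject)
            if not os.path.isdir(subject_dir):
                continue
            label = subject - 1  # class labels start at 0
            for name in sorted(os.listdir(subject_dir)):
                if name.endswith(ext):
                    out.write("%s/s%d/%s %d\n" % (path_prefix, subject, name, label))
                    count += 1
    return count
```

For example, `write_image_list("data/att_faces/train", "data/att_faces/train.txt", "data/att_faces/train")` run from the Caffe root would produce lines such as `data/att_faces/train/s1/1.jpg 0`, which resolve correctly when ROOT is `./`.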
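The data can then be used to build the network definition: take models/finetune_flickr_style/train_val.prototxt as the template and fill in the network structure from vgg_face_caffe/VGG_FACE_deploy.prototxt. That is, add the data input layers, change the number of outputs of the last fully connected layer, and update the old Caffe syntax. The revised content is as follows: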
name: "VGG_FACE_16_Net"
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/train.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/tst.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5_3"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
# Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8_flickr"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_flickr"
# propagate_down: false skips backpropagation into fc7, so the pretrained layers below are not updated and only this newly initialized layer is trained; with no param block, lr_mult keeps its default of 1
propagate_down: false
inner_product_param {
num_output: 40
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8_flickr"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8_flickr"
bottom: "label"
top: "loss"
}
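One way to see why crop_size stays at 224 is to trace the feature map size through the layers above: the pretrained fc6 weights expect a fixed number of inputs, and a different input size would change that number. The sketch below is only an arithmetic check; the helper names are hypothetical, and the formulas are the usual Caffe output-size rules (convolution rounds down, pooling rounds up).

```python
import math


def conv_out(size, kernel=3, pad=1, stride=1):
    # Convolution output size: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2, pad=0):
    # Pooling output size rounds up: ceil((size + 2*pad - kernel) / stride) + 1
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def vgg16_feature_shape(input_size=224):
    """Return (channels, height, width) of pool5 for a square input."""
    # (number of 3x3 convolutions, output channels) for the five blocks above
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    size, channels = input_size, 3
    for num_convs, out_channels in blocks:
        for _ in range(num_convs):
            size = conv_out(size)   # 3x3, pad 1, stride 1 keeps the size
        channels = out_channels
        size = pool_out(size)       # 2x2, stride 2 halves the size
    return channels, size, size


channels, height, width = vgg16_feature_shape(224)
fc6_inputs = channels * height * width
print(channels, height, width, fc6_inputs)  # 512 7 7 25088
```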
Copy models/finetune_flickr_style/solver.prototxt and adapt it to the present problem. The main changes are:
net: "models/finetune/train_val.prototxt"
test_iter: 100
test_interval: 100
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 2000
display: 20
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "models/finetune/finetune"
# uncomment the following to default to CPU mode solving
#solver_mode: CPU
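To get a feel for these settings, the learning rate under the "step" policy and the amount of data seen can be worked out by hand. The snippet below is a small sketch of that arithmetic; it assumes the 6/4 train/test split per person used in the conversion script and the batch_size of 1 from the two data layers, and `step_lr` is a hypothetical helper name.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=2000):
    """Learning rate under lr_policy "step": base_lr * gamma ^ floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


# Dataset sizes implied by the split: 40 people, 6 train / 4 test images each
train_images = 40 * 6
test_images = 40 * 4
batch_size = 1   # from both ImageData layers
max_iter = 10000
test_iter = 100

epochs = max_iter * batch_size / float(train_images)
images_per_test_pass = test_iter * batch_size

for it in (0, 1999, 2000, 4000, 9999):
    print(it, step_lr(it))
print("passes over the training set: %.1f" % epochs)
print("test images per test pass: %d of %d" % (images_per_test_pass, test_images))
```

With a batch size of 1, test_iter: 100 evaluates 100 of the 160 test images in each test pass; raising test_iter to 160 would cover the whole test set, if that is what you want to measure.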
Finally, fine-tune the model on your own data. The command is as follows:
./build/tools/caffe train -solver models/finetune/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0