Classifying a custom dataset with Caffe

1. Prepare the dataset:

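Create a myimage folder under /examples/imagenet/ to hold the dataset.

myimage contains train (bus, dinosaur, elephant, flower, horse), the training set with five classes of 80 images each, and val (......), the test set with the same five classes of 20 images each. Next, create the label files. The MATLAB code for val.txt is: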

            

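  % Build val.txt: one "class_folder/image_name label" line per image.
  pos_folder = '/home/handson/caffe/examples/imagenet/myimage/val/';
  pos = dir(pos_folder);
  fid = fopen('val.txt', 'w');
  for i = 3:7  % assumes entries 1 and 2 are '.' and '..', so 3..7 are the five class folders
      img_name = pos(i).name;
      img_folder = [pos_folder, pos(i).name];
      images = dir(img_folder);
      for j = 3:size(images, 1)  % skip '.' and '..' again
          d = strcat(img_name, '/', images(j).name);
          % Exactly one space between %s and %d; otherwise the database generated below is wrong.
          fprintf(fid, '%s %d\n', d, i-3);
      end
  end
  fclose(fid);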

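Result: put the label files train.txt and val.txt in the myimage folder (train.txt is produced the same way, by pointing pos_folder at the train directory and changing the output file name). val.txt looks like this:

bus/395.jpg 0
bus/396.jpg 0
bus/397.jpg 0
bus/398.jpg 0
bus/399.jpg 0
dinosaur/420.jpg 1
dinosaur/421.jpg 1
dinosaur/422.jpg 1
dinosaur/423.jpg 1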

....
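For readers without MATLAB, the same label file can be produced with a short Python script. This is only a sketch under assumptions: the function name write_label_file is made up for illustration, and labels are assigned in sorted order of the class folder names, which matches the bus=0, dinosaur=1, ... numbering shown above only if the folders sort the same way on your system.

```python
import os


def write_label_file(split_dir, out_path):
    """Write one 'class_folder/file label' line per image under split_dir.

    Labels are 0, 1, 2, ... in sorted order of the class folder names.
    Returns the number of lines written.
    """
    classes = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
    count = 0
    with open(out_path, "w") as fid:
        for label, cls in enumerate(classes):
            for fname in sorted(os.listdir(os.path.join(split_dir, cls))):
                # Exactly one space between the path and the label.
                fid.write("%s/%s %d\n" % (cls, fname, label))
                count += 1
    return count


# Example call (paths taken from this tutorial):
# write_label_file("examples/imagenet/myimage/val", "examples/imagenet/myimage/val.txt")
```

Unlike dir in MATLAB, os.listdir does not return the '.' and '..' entries, so no index offset is needed.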

2. Convert to LMDB format

Run the script create_imagenet.sh:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=examples/imagenet/myimage
DATA=examples/imagenet/myimage
TOOLS=build/tools

TRAIN_DATA_ROOT=examples/imagenet/myimage/train/
VAL_DATA_ROOT=examples/imagenet/myimage/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $EXAMPLE/train.txt \
    $EXAMPLE/train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $EXAMPLE/val.txt \
    $EXAMPLE/val_lmdb

echo "Done."


This creates two folders, train_lmdb and val_lmdb, under myimage.
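If the conversion reports missing files or the databases come out wrong, the label files are a likely place to look, given the single-space requirement noted in step 1. The following Python sketch checks each line for the "path label" form and confirms that the image exists under the data root. The function name check_label_file and the regular expression are illustrative assumptions, not part of Caffe.

```python
import os
import re

# A path, exactly one space, then a non-negative integer label.
LINE_RE = re.compile(r"^(\S(?:.*\S)?) (\d+)$")


def check_label_file(label_path, data_root):
    """Return a list of (line_number, problem) tuples; empty means no problems found."""
    problems = []
    with open(label_path) as f:
        for n, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            m = LINE_RE.match(line)
            if m is None:
                problems.append((n, "not in 'path label' form with a single space"))
            elif not os.path.isfile(os.path.join(data_root, m.group(1))):
                problems.append((n, "image not found: " + m.group(1)))
    return problems


# Example call (paths taken from this tutorial):
# check_label_file("examples/imagenet/myimage/val.txt", "examples/imagenet/myimage/val/")
```

Running it on both train.txt and val.txt before create_imagenet.sh is cheaper than rebuilding the LMDB after a failed run.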



3. Compute the mean:

Run the script make_imagenet_mean.sh:

#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=examples/imagenet/myimage
DATA=examples/imagenet/myimage
TOOLS=build/tools

$TOOLS/compute_image_mean $EXAMPLE/train_lmdb \
  $DATA/imagenet_mean.binaryproto

echo "Done."

Note: only the training-set mean needs to be computed; the test set is normalized with that same training-set mean.
Result: imagenet_mean.binaryproto is generated under myimage.
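To make the idea of a mean image concrete, here is a toy Python illustration. It is not how compute_image_mean is implemented; it assumes tiny single-channel images stored as nested lists, and the function names are made up. It shows the two points from the note above: the mean is taken per pixel over the training images, and the same mean is then subtracted from a test image.

```python
def mean_image(images):
    """Per-pixel mean over a list of equally sized 2-D images (lists of rows)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(width)]
        for r in range(height)
    ]


def subtract_mean(image, mean):
    """Subtract the mean image from one image, pixel by pixel."""
    return [
        [pixel - m for pixel, m in zip(row, mean_row)]
        for row, mean_row in zip(image, mean)
    ]


# Two 2x2 "training images" and one "test image".
train_images = [[[0, 2], [4, 6]], [[2, 4], [6, 8]]]
train_mean = mean_image(train_images)            # [[1.0, 3.0], [5.0, 7.0]]
test_centered = subtract_mean([[1, 3], [5, 10]], train_mean)
```

The test image is centered with train_mean, never with a mean computed from the test images themselves.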

4. Train the network (AlexNet is used here):

Edit the solver configuration, solver.prototxt:

net: "examples/imagenet/myimage/train_val.prototxt"
# The test set has 100 images and the test batch size is 10,
# so 10 iterations cover the whole test set.
test_iter: 10
test_interval: 100
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100
display: 20
max_iter: 400
momentum: 0.9
weight_decay: 0.0005
snapshot: 200
snapshot_prefix: "examples/imagenet/myimage"
solver_mode: CPU
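The solver numbers follow from the dataset size, and a few lines of Python make the arithmetic explicit. This is a sketch with made-up helper names. The image counts come from step 1 (5 x 80 training images, 5 x 20 test images), the batch sizes from the data layers in train_val.prototxt (50 for training, 10 for testing), and the learning-rate formula is the one Caffe documents for the "step" policy: base_lr * gamma ^ floor(iter / stepsize).

```python
import math


def required_test_iter(num_test_images, test_batch_size):
    """Iterations needed for one full pass over the test set."""
    return math.ceil(num_test_images / test_batch_size)


def step_lr(iteration, base_lr, gamma, stepsize):
    """Learning rate under the "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


def epochs(max_iter, train_batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return max_iter * train_batch_size / num_train_images


print("test_iter:", required_test_iter(100, 10))
print("epochs:", epochs(400, 50, 400))
for it in (0, 100, 200, 300):
    print("iteration", it, "lr", step_lr(it, 0.001, 0.1, 100))
```

With these settings the learning rate drops by a factor of 10 every 100 iterations, so the last 100 iterations run at roughly one thousandth of base_lr.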
Modify the network definition, train_val.prototxt:

 name: "AlexNet" 
  
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "examples/imagenet/myimage/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/myimage/train_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "examples/imagenet/myimage/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/myimage/val_lmdb"
    batch_size: 10
    backend: LMDB
  }
}
Finally, set num_output in the fc8 layer to 5, one output per class.
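The value for num_output can be read off the label file instead of being typed by hand. The Python sketch below (the helper name num_classes is an assumption for illustration) counts the distinct labels and also checks that they run from 0 to K-1 without gaps, since the softmax loss expects every label to be a valid output index.

```python
def num_classes(label_path):
    """Count distinct integer labels in a 'path label' file.

    Raises ValueError if the labels are not exactly 0, 1, ..., K-1.
    """
    labels = set()
    with open(label_path) as f:
        for line in f:
            line = line.strip()
            if line:
                # The label is the text after the last space on the line.
                labels.add(int(line.rsplit(" ", 1)[1]))
    if labels != set(range(len(labels))):
        raise ValueError("labels are not contiguous from 0: %s" % sorted(labels))
    return len(labels)


# Example call (path taken from this tutorial); the result should equal num_output in fc8:
# num_classes("examples/imagenet/myimage/train.txt")
```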

5. From the Caffe root directory, run the training script train_caffenet.sh:

#!/usr/bin/env sh
set -e

./build/tools/caffe train \
    --solver=examples/imagenet/myimage/solver.prototxt $@


6. Test:

./build/tools/caffe.bin test -model examples/..train_val.prototxt -weights examples/..(the caffemodel produced by training) -iterations 10

7. Accuracy is roughly 90%.
