Deep-Learning-Based Image Matching Series - [patch based matching 4] - Implementing MatchNet Step by Step

The previous post (Deep-Learning-Based Image Matching Series - [patch based matching 3] - learning the MNIST siamese example) walked through the Caffe MNIST siamese model in full and trained it successfully, so you should have a feel for the workflow by now. We now return to our goal, MatchNet. Although the authors did not release source code, we can implement it ourselves.

Of course, several practical questions remain, such as preparing the dataset. We will tackle them as we build the model: focus on the main line, and break the problems down one at a time.

1. Implementing the MatchNet network structure

In the first post of this series we already covered MatchNet's network structure; now let's express it in Caffe. This requires two .prototxt files: one for the model and one for the solver. Choosing the solver parameters is probably the trickier of the two problems.

First, let's write the train.prototxt file.

Caffe basics

Blob: the basic data structure; it holds the learned parameters and the data produced as it flows through the network.
Layer: the basic unit of a network, from which all concrete layer types are derived. People who modify this part are usually working on feature representations.
Net: the assembled network, built by composing Layer-derived classes.
Solver: the optimizer for a Net. People who modify this part are usually working on DL optimization methods.



Each layer definition needs a few pieces of information: a name, a type, its bottom (input) and top (output) blobs, and layer-specific parameters. Let's start with the train.prototxt file.

name: "mnist_matchnet_train_test"
## data input
layer {
  name: "pair_data"
  type: "Data"
  top: "pair_data"
  top: "sim"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/siamese/mnist_siamese_train_leveldb"
    batch_size: 64
  }
}
layer {
  name: "pair_data"
  type: "Data"
  top: "pair_data"
  top: "sim"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/siamese/mnist_siamese_test_leveldb"
    batch_size: 100
  }
}
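A note on the data layers above: the `scale: 0.00390625` in `transform_param` is exactly 1/256, so the raw 8-bit pixel values in [0, 255] are rescaled into [0, 1), a range the xavier-initialized weights handle well. A quick check:

```python
# transform_param's scale 0.00390625 is exactly 1/256
scale = 0.00390625
assert scale == 1 / 256
print(0 * scale, 255 * scale)  # 0.0 0.99609375
```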

layer {
  name: "slice_pair"
  type: "Slice"
  bottom: "pair_data"
  top: "data"
  top: "data_p"
  slice_param {
    slice_dim: 1
    slice_point: 1
  }
}
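The Slice layer splits `pair_data` along the channel axis (`slice_dim: 1`) at `slice_point: 1`: each example in the batch stores the two patches of a pair as two channels, and slicing yields the one-channel batches `data` and `data_p`, one per branch. The equivalent operation in NumPy:

```python
import numpy as np

# a batch of 64 image pairs, stored as two channels per example
pair_data = np.random.rand(64, 2, 28, 28)

# slice along axis 1 (channels) at slice_point 1
data, data_p = pair_data[:, :1], pair_data[:, 1:]
print(data.shape, data_p.shape)  # (64, 1, 28, 28) (64, 1, 28, 28)
```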
########### convolution layers ########
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {    ## weight and bias learning rates
    name: "conv1_w"
    lr_mult: 1
  }
  param {
    name: "conv1_b"
    lr_mult: 2
  }
  convolution_param { ## output channels, kernel size, stride
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    name: "conv2_w"
    lr_mult: 1
  }
  param {
    name: "conv2_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    name: "conv3_w"
    lr_mult: 1
  }
  param {
    name: "conv3_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    name: "conv4_w"
    lr_mult: 1
  }
  param {
    name: "conv4_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 96
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    name: "conv5_w"
    lr_mult: 1
  }
  param {
    name: "conv5_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv5"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
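A useful sanity check on kernel/stride choices is to trace the spatial output size layer by layer. Caffe computes convolution output sizes by rounding down and pooling output sizes by rounding up; a small helper (assuming no padding, as in the layers above):

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride, pad=0):
    # Caffe pooling rounds up, so border pixels are still pooled
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

# first two stages on a 28x28 MNIST input
s = conv_out(28, 5)      # conv1: 24
s = pool_out(s, 2, 2)    # pool1: 12
print(s)
```

It is worth tracing the full stack this way for your input resolution to confirm that the three added 3x3 convolutions still produce a positive output size; small inputs may need padding or resizing.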

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    name: "ip1_w"
    lr_mult: 1
  }
  param {
    name: "ip1_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    name: "ip2_w"
    lr_mult: 1
  }
  param {
    name: "ip2_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "feat"
  type: "InnerProduct"
  bottom: "ip2"
  top: "feat"
  param {
    name: "feat_w"
    lr_mult: 1
  }
  param {
    name: "feat_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
####################### second branch (weights shared with the first via matching param names) #######################
layer {
  name: "conv1_p"
  type: "Convolution"
  bottom: "data_p"
  top: "conv1_p"
  param {
    name: "conv1_w"
    lr_mult: 1
  }
  param {
    name: "conv1_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1_p"
  type: "Pooling"
  bottom: "conv1_p"
  top: "pool1_p"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2_p"
  type: "Convolution"
  bottom: "pool1_p"
  top: "conv2_p"
  param {
    name: "conv2_w"
    lr_mult: 1
  }
  param {
    name: "conv2_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2_p"
  type: "Pooling"
  bottom: "conv2_p"
  top: "pool2_p"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}



layer {
  name: "conv3_p"
  type: "Convolution"
  bottom: "pool2_p"
  top: "conv3_p"
  param {
    name: "conv3_w"
    lr_mult: 1
  }
  param {
    name: "conv3_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "conv4_p"
  type: "Convolution"
  bottom: "conv3_p"
  top: "conv4_p"
  param {
    name: "conv4_w"
    lr_mult: 1
  }
  param {
    name: "conv4_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 96
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "conv5_p"
  type: "Convolution"
  bottom: "conv4_p"
  top: "conv5_p"
  param {
    name: "conv5_w"
    lr_mult: 1
  }
  param {
    name: "conv5_b"
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool3_p"
  type: "Pooling"
  bottom: "conv5_p"
  top: "pool3_p"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1_p"
  type: "InnerProduct"
  bottom: "pool3_p"
  top: "ip1_p"
  param {
    name: "ip1_w"
    lr_mult: 1
  }
  param {
    name: "ip1_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1_p"
  type: "ReLU"
  bottom: "ip1_p"
  top: "ip1_p"
}
layer {
  name: "ip2_p"
  type: "InnerProduct"
  bottom: "ip1_p"
  top: "ip2_p"
  param {
    name: "ip2_w"
    lr_mult: 1
  }
  param {
    name: "ip2_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "feat_p"
  type: "InnerProduct"
  bottom: "ip2_p"
  top: "feat_p"
  param {
    name: "feat_w"
    lr_mult: 1
  }
  param {
    name: "feat_b"
    lr_mult: 2
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "ContrastiveLoss"
  bottom: "feat"
  bottom: "feat_p"
  bottom: "sim"
  top: "loss"
  contrastive_loss_param {
    margin: 1
  }
}

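The `ContrastiveLoss` layer at the end of the network works as follows: for similar pairs (sim = 1) it penalizes the squared Euclidean distance between `feat` and `feat_p`; for dissimilar pairs (sim = 0) it penalizes only when the distance falls inside the margin. A NumPy sketch of the computation:

```python
import numpy as np

def contrastive_loss(feat, feat_p, sim, margin=1.0):
    # Euclidean distance between each pair of feature vectors
    d = np.linalg.norm(feat - feat_p, axis=1)
    # sim == 1: pull matching pairs together (penalize distance);
    # sim == 0: push non-matching pairs at least `margin` apart
    per_pair = sim * d**2 + (1 - sim) * np.maximum(margin - d, 0)**2
    return per_pair.sum() / (2 * len(sim))

feat   = np.array([[0.0, 0.0], [0.0, 0.0]])
feat_p = np.array([[2.0, 0.0], [2.0, 0.0]])
sim    = np.array([1.0, 0.0])   # first pair matches, second does not
print(contrastive_loss(feat, feat_p, sim))  # 1.0
```

Note how the second (non-matching) pair contributes nothing here because its distance already exceeds the margin of 1.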
2. Solver parameter settings

The previous section described applying a MatchNet-style network to the MNIST siamese dataset; relative to the original network we added three convolution layers and one max pooling layer. Now let's set the solver parameters.

# The train/test net protocol buffer definition
net: "examples/siamese/mnist_matchnet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0000
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 50000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/siamese/mnist_siamese"
# solver mode: CPU or GPU
solver_mode: GPU
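With `lr_policy: "inv"`, Caffe decays the learning rate as base_lr * (1 + gamma * iter)^(-power). A quick check of how the rate evolves over the 50,000 iterations configured above:

```python
def inv_lr(base_lr, gamma, power, it):
    # Caffe "inv" policy: base_lr * (1 + gamma * iter) ** (-power)
    return base_lr * (1 + gamma * it) ** (-power)

for it in (0, 10000, 50000):
    print(it, inv_lr(0.01, 0.0001, 0.75, it))
```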

Finally, write a small training script:

#!/usr/bin/env sh
set -e

TOOLS=./build/tools

$TOOLS/caffe train --solver=examples/siamese/mnist_match_solver.prototxt "$@"

[Figure 1: training output]

The figure shows that this network structure can reach higher accuracy. It may, however, have overfit during training: the training loss is very small, while the test results are not quite as good.

This network does not change the loss function; it keeps the one from the siamese example. Replacing it is the next piece of work.
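For reference, the MatchNet paper itself does not use a contrastive loss: it concatenates the two tower features, passes them through fully connected "metric network" layers, and trains with a two-way softmax cross-entropy. A minimal sketch of that loss term alone (the metric-network layers themselves are omitted here):

```python
import numpy as np

def two_way_xent(logits, label):
    # softmax over the two "non-match / match" logits, then cross-entropy
    z = logits - logits.max()              # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

# with equal logits the network is maximally unsure: loss = ln 2
print(two_way_xent(np.array([0.0, 0.0]), 1))  # ~0.6931
```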



