[FCN in Practice] 02 Model Transfer and Initialization

Fully Convolutional Networks for Semantic Segmentation

——————————————————————————————————————————
[FCN in Practice] 01 Common Problems http://blog.csdn.net/binlearning/article/details/72854136
[FCN in Practice] 02 Model Transfer and Initialization http://blog.csdn.net/binlearning/article/details/72854244
[FCN in Practice] 03 Training http://blog.csdn.net/binlearning/article/details/72854407
[FCN in Practice] 04 Prediction http://blog.csdn.net/binlearning/article/details/72854583
[Project source] https://github.com/binLearning/fcn_voc_32s
——————————————————————————————————————————

The transplant function transfers the parameters of an already trained model;
the interp function initializes the parameters of the upsampling layers.

1. transplant
Implemented in surgery.py:

def transplant(new_net, net, suffix=''):
    """
    Transfer weights by copying matching parameters, coercing parameters of
    incompatible shape, and dropping unmatched parameters.

    The coercion is useful to convert fully connected layers to their
    equivalent convolutional layers, since the weights are the same and only
    the shapes are different.  In particular, equivalent fully connected and
    convolution layers have shapes O x I and O x I x H x W respectively for O
    outputs channels, I input channels, H kernel height, and W kernel width.

    Both the `net` and `new_net` arguments must be instantiated `caffe.Net`s.
    """
    for p in net.params:
        p_new = p + suffix
        if p_new not in new_net.params:
            print 'dropping', p
            continue
        for i in range(len(net.params[p])):
            if i > (len(new_net.params[p_new]) - 1):
                print 'dropping', p, i
                break
            if net.params[p][i].data.shape != new_net.params[p_new][i].data.shape:
                print 'coercing', p, i, 'from', net.params[p][i].data.shape, 'to', new_net.params[p_new][i].data.shape
            else:
                print 'copying', p, ' -> ', p_new, i
            new_net.params[p_new][i].data.flat = net.params[p][i].data.flat

① For layers with the same name and matching shapes in both nets, the parameters are copied directly from the old net into the new one;
② For layers with the same name but mismatched shapes, the parameters of the old net are force-copied (coerced) into the new net;
③ Layers that do not exist in the new net are simply dropped.
Note: both ① and ② actually copy the parameters element by element through a one-dimensional flat iterator (numpy.ndarray.flat).
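The flat-iterator copy used in steps ① and ② can be sketched with plain NumPy. The shapes below are toy values, not the real fc6/conv dimensions; the point is that a fully connected weight of shape (O, I*H*W) and its equivalent convolution weight of shape (O, I, H, W) hold the same values in the same row-major order, so copying through `.flat` "converts" one into the other:

```python
import numpy as np

# Hypothetical shapes: an FC weight (O, I*H*W) coerced into a conv weight (O, I, H, W).
fc_weights = np.arange(2 * 3 * 2 * 2, dtype=np.float32).reshape(2, 12)
conv_weights = np.empty((2, 3, 2, 2), dtype=np.float32)

# The same element-by-element copy that transplant() performs via .flat:
conv_weights.flat = fc_weights.flat

# Values are identical in row-major order; only the shape differs.
assert np.array_equal(conv_weights.reshape(2, 12), fc_weights)
```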

2. interp
Called from solve.py:

# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)
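The layer selection relies purely on naming: any parameterized layer whose name contains 'up' is treated as an upsampling layer. A minimal sketch with hypothetical layer names for a voc-fcn32s-like net, where only the upscore deconvolution matches:

```python
# Hypothetical parameter names of a voc-fcn32s-like net:
params = ['conv1_1', 'fc6', 'fc7', 'score_fr', 'upscore']

# Same name filter as in solve.py:
interp_layers = [k for k in params if 'up' in k]
assert interp_layers == ['upscore']
```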

Implemented in surgery.py:

import numpy as np  # imported at the top of surgery.py

def upsample_filt(size):
    """
    Make a 2D bilinear kernel suitable for upsampling of the given (h, w) size.
    """
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)

def interp(net, layers):
    """
    Set weights of each layer in layers to bilinear kernels for interpolation.
    """
    for l in layers:
        m, k, h, w = net.params[l][0].data.shape
        if m != k and k != 1:
            raise ValueError('input + output channels need to be the same or |output| == 1')
        if h != w:
            raise ValueError('filters need to be square')
        filt = upsample_filt(h)
        net.params[l][0].data[range(m), range(k), :, :] = filt

This initializes the kernels of the upsampling (Deconvolution) layers so that they perform bilinear interpolation.
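To see the kernel concretely, the same construction can be evaluated for size 4 (the kernel size paired with a stride-2 deconvolution). The result is separable: the outer product of [0.25, 0.75, 0.75, 0.25] with itself, and it sums to stride² = 4, which is what keeps the upsampled output at the same intensity scale:

```python
import numpy as np

def upsample_filt(size):
    # Same construction as upsample_filt() in surgery.py.
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

filt = upsample_filt(4)  # kernel for a stride-2 Deconvolution layer
# First row: 0.25 * [0.25, 0.75, 0.75, 0.25]
assert np.allclose(filt[0], [0.0625, 0.1875, 0.1875, 0.0625])
assert np.isclose(filt.sum(), 4.0)
```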

After these two steps we obtain the initial model for voc-fcn32s. The complete program is as follows:

3. Model transfer and initialization

'''
Transfer weights by copying matching parameters, coercing parameters of incompatible shape,
initialize weights of Deconv layer to bilinear kernels for interpolation.
'''

from __future__ import absolute_import
from __future__ import print_function

import os
import sys

import caffe

from surgery import transplant, interp


def main():
  rt_dir = './model'

  model_src   = os.path.join(rt_dir, sys.argv[1])
  weights_src = os.path.join(rt_dir, sys.argv[2])
  model_dst   = os.path.join(rt_dir, sys.argv[3])
  weights_dst = os.path.join(rt_dir, sys.argv[4])

  caffe.set_mode_cpu()

  net_src = caffe.Net(model_src, weights_src, caffe.TRAIN)
  net_dst = caffe.Net(model_dst, caffe.TRAIN)

  # net architecture
  print('======== source network architecture ========')
  for layer_name, blob in net_src.blobs.iteritems():
    print(layer_name + '\t' + str(blob.data.shape))
  print('====== destination network architecture =====')
  for layer_name, blob in net_dst.blobs.iteritems():
    print(layer_name + '\t' + str(blob.data.shape))

  # net parameters
  print('========= source network parameters =========')
  for layer_name, param in net_src.params.iteritems():
    print(layer_name + '\t' + str(param[0].data.shape))  # param[1] holds the bias shape
  print('======= destination network parameters ======')
  for layer_name, param in net_dst.params.iteritems():
    print(layer_name + '\t' + str(param[0].data.shape))  # param[1] holds the bias shape

  # transfer
  # copy parameters source net => destination net
  print('================= transfer ==================')
  transplant(net_dst, net_src)

  # initialize
  # use bilinear kernels to initialize Deconvolution layer
  print('================ initialize =================')
  interp_layers = [k for k in net_dst.params.keys() if 'up' in k]
  for k in interp_layers:
    print(k)
  interp(net_dst, interp_layers)

  print('=============================================')

  # save new weights
  net_dst.save(weights_dst)


if __name__ == '__main__':
  main()
