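When a network needs several inputs, or when the label has more than one dimension, the data layers that ship with Caffe are hard to use.
Modifying the C++ source slows down the work of implementing an algorithm considerably. Writing such special layers through the Python interface is a far more productive choice for users. So how do you write your own Python layer? Start by declaring it in the network prototxt: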
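layer {
  name: "data"     # layer name
  type: "Python"   # layer type
  top: "data"      # first output of the layer
  top: "label"     # second output of the layer
  python_param {   # parameters of the Python layer
    module: "datalayer"  # module name, normally the file name of the implementation (its directory must be on PYTHONPATH so that the Python interpreter can find the module)
    layer: "DataLayer"   # class name of the implementation (defined in the module above)
    param_str: "{'source':'/xxx.list','im_shape':(128,128),'batch_size':32}"  # arguments passed to the Python layer, given as a dict (names and formats must match what the implementation expects)
  }
  include { phase: TRAIN }
}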
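The param_str value reaches the Python layer as a plain string, and the layer has to turn it back into a dictionary. A minimal sketch of that step, assuming the string is a Python literal as in the layer definition above. It uses the standard library's ast.literal_eval rather than eval (a substitution for illustration, not what the code in this post uses), and parse_param_str is a hypothetical helper name:

```python
import ast


def parse_param_str(param_str):
    """Turn a param_str literal into a dict and check the keys the layer reads."""
    # literal_eval accepts Python literals only; it never executes code.
    params = ast.literal_eval(param_str)
    if not isinstance(params, dict):
        raise ValueError("param_str must describe a dict")
    for key in ("source", "im_shape", "batch_size"):
        if key not in params:
            raise KeyError("missing layer parameter: " + key)
    return params


# Same string as in the layer definition above.
params = parse_param_str(
    "{'source':'/xxx.list','im_shape':(128,128),'batch_size':32}")
print(params["im_shape"], params["batch_size"])
```

Checking the keys up front makes a typo in the prototxt fail with a clear message instead of deep inside the layer's setup.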
Once the Python layer is declared, it is time to write the code.
Put the following code in datalayer.py:
import caffe
import cv2
import numpy as np
import numpy.random as npr


class DataLayer(caffe.Layer):  # inherits from caffe.Layer, the base layer class
    """
    This is a data layer for training a CNN model.
    """

    def setup(self, bottom, top):
        self.top_names = ['data', 'label']
        # === Read input parameters ===
        # param_str holds the parameter dictionary passed in from the prototxt.
        params = eval(self.param_str)
        # Store input as class variables.
        self.batch_size = params['batch_size']
        # Create a BatchLoader instance to load the images.
        self.batch_loader = BatchLoader(params, None)
        # === Reshape tops ===
        # Since we use a fixed input image size, we can shape the tops here.
        # The shape of the top blobs is N x C x H x W.
        top[0].reshape(self.batch_size, 1, params['im_shape'][0],
                       params['im_shape'][1])
        top[1].reshape(self.batch_size, 1, params['im_shape'][0],
                       params['im_shape'][1])

    def forward(self, bottom, top):
        """
        Load data.
        """
        for itt in range(self.batch_size):
            # Use the batch loader to load the next image.
            im, label = self.batch_loader.load_next_image()
            # Write the data directly into the top blobs.
            top[0].data[itt, 0, ...] = im
            top[1].data[itt, 0, ...] = label

    def reshape(self, bottom, top):  # implement this if the layer needs it
        pass

    def backward(self, top, propagate_down, bottom):  # implement this if the custom layer needs backpropagation
        pass
class BatchLoader(object):
    """
    This class abstracts away the loading of images.
    Images can either be loaded singly, or in a batch. The latter is used for
    the asynchronous data layer to preload batches while other processing is
    performed.
    """

    def __init__(self, params, result):
        self.result = result
        self.batch_size = params['batch_size']
        self.source = params['source']
        self.im_shape = params['im_shape']
        # Get the list of image indexes: one "image_path label_path" pair per line.
        with open(self.source) as f:
            self.indexlist = [line.rstrip('\n') for line in f]
        # Current image
        self._cur = 0
        # The official example also creates a SimpleTransformer here; it is
        # not defined in this file and not used by this layer, so it is omitted.
        print("BatchLoader initialized with {} images.".format(
            len(self.indexlist)))

    def load_next_image(self):
        """
        Load the next image in a batch.
        """
        # After a full pass over the list, start over in a new random order.
        if self._cur == len(self.indexlist):
            self._cur = 0
            npr.shuffle(self.indexlist)  # np.random.shuffle
        # Load an image (flag 0 reads it as grayscale).
        image_file_name = self.indexlist[self._cur].split(' ')[0]
        im = np.asarray(cv2.imread(image_file_name, 0), dtype=np.float32)
        im = np.reshape(im, self.im_shape)
        # Load a label image.
        label_file_name = self.indexlist[self._cur].split(' ')[1]
        label = np.asarray(cv2.imread(label_file_name, 0), dtype=np.float32)
        label = np.reshape(label, self.im_shape)
        self._cur += 1
        return im, label
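The loader above reads a list file in which each line holds an image path and a label path separated by a space, and it reshuffles the list after every full pass. A minimal sketch of just that bookkeeping, with no image I/O, so it can be tried without caffe or OpenCV. ListCycler and the file names are hypothetical and used for illustration only:

```python
import random


class ListCycler(object):
    """Cycle through 'image label' lines, reshuffling after each full pass."""

    def __init__(self, lines, seed=None):
        # Split each non-empty line into (image_path, label_path).
        self.pairs = [tuple(line.strip().split(' ')[:2])
                      for line in lines if line.strip()]
        self._cur = 0
        self._rng = random.Random(seed)

    def next_pair(self):
        # Once every entry has been served, start over in a new order.
        if self._cur == len(self.pairs):
            self._cur = 0
            self._rng.shuffle(self.pairs)
        pair = self.pairs[self._cur]
        self._cur += 1
        return pair


# Hypothetical list-file contents.
cycler = ListCycler(["a.png a_label.png\n", "b.png b_label.png\n"], seed=0)
first_pass = [cycler.next_pair() for _ in range(2)]
second_pass = [cycler.next_pair() for _ in range(2)]
```

The first pass follows the file order; later passes contain the same pairs in a shuffled order, which is the behavior load_next_image relies on.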
Within these classes you can implement whatever your own task requires. The next step is training:
import caffe
import numpy as np

if __name__ == '__main__':
    caffe.set_mode_gpu()
    caffe.set_device(0)
    solver_file = 'solver_v2.prototxt'
    max_iter = 1000000
    solver = caffe.get_solver(solver_file)
    # Train the model directly according to the solver settings:
    # solver.solve()
    net = solver.net
    # Or train the model one step at a time:
    for i in range(0, max_iter):
        solver.step(1)
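Because solver.step(n) runs n iterations per call, the loop can also advance in larger chunks and do some work between them, such as logging. A minimal sketch under that assumption: train_in_chunks and DummySolver are hypothetical names, and the dummy only stands in for a caffe solver so the snippet runs without caffe installed:

```python
def train_in_chunks(solver, max_iter, chunk, on_chunk=None):
    """Call solver.step() in chunks until max_iter iterations have run."""
    done = 0
    while done < max_iter:
        n = min(chunk, max_iter - done)  # the last chunk may be shorter
        solver.step(n)
        done += n
        if on_chunk is not None:
            on_chunk(done)  # e.g. log progress or inspect the net here
    return done


class DummySolver(object):
    """Stand-in for a caffe solver; it only counts iterations."""

    def __init__(self):
        self.iterations = 0

    def step(self, n):
        self.iterations += n


dummy = DummySolver()
seen = []
total = train_in_chunks(dummy, max_iter=250, chunk=100, on_chunk=seen.append)
```

With a real solver, the same function could be called as train_in_chunks(solver, max_iter, 100), passing a callback that reads whatever blobs the network exposes.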
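This is a simple example in which the network structure is already defined in the prototxt that solver_file refers to. The network structure can also be defined directly in Python.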
import caffe
from caffe import layers as L

# e.g. define the Python data layer described above
n = caffe.NetSpec()
n.data, n.label = L.Python(
    python_param=dict(
        module='datalayer', layer='DataLayer',
        param_str="{'source':'xxx', 'im_shape':(xxx,xxx), 'batch_size':xxx}"),
    ntop=2)  # ntop is the number of top blobs
n.acc = L.EuclideanLoss(n.data, n.label)
proto = n.to_proto()
with open('train.prototxt', 'w+') as f:
    f.write(str(proto))
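When the network is generated from Python, param_str does not have to be typed by hand: it can be produced from an ordinary dictionary with str(). A minimal sketch, assuming the example values from the layer definition earlier in this post; the round trip through ast.literal_eval is only there to check that the string parses back into the same dictionary:

```python
import ast

# Example values only; replace them with your own list file and sizes.
layer_params = {'source': '/xxx.list', 'im_shape': (128, 128), 'batch_size': 32}

# str() of a dict of literals is itself a valid Python literal.
param_str = str(layer_params)

# Check that the layer would recover the same dictionary from the string.
restored = ast.literal_eval(param_str)
```

The resulting string could then be passed as param_str=param_str in the python_param dict, which avoids quoting mistakes in long hand-written strings.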
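The code above is an example of writing a Caffe network structure through the Python interface.
This post is based on the pascalmultilabeldatalayer example provided with Caffe; see that example for details.
Written 2017-3-30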