Caffe Study Notes 1: Predicting with a Trained MNIST Model (Two Methods)

Training on MNIST with Caffe is straightforward, but how do you then use the trained model to make predictions?

References:

http://blog.csdn.net/l691899397/article/details/52233454
http://www.jianshu.com/p/9644f7ec0a03
http://www.jianshu.com/p/9e30328a0a71

For the theory, see the first blog post above, or read the paper.

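I think prediction comes down to two key steps: 1. loading the trained caffemodel together with the deploy model description file; 2. feeding the image to be predicted into the network correctly, which requires understanding Caffe's Blob. Now to the main topic: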

I. Creating the model description file

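Having read through Caffe's overall architecture (I have only looked at the overall framework and have not dug deep into the source code), I found Caffe to be highly modular and easy to read and understand; writing a whole neural network is close to a visual exercise. My code can be downloaded from my GitHub: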

https://github.com/zefan7564/caffe

Below is the network definition I use for prediction.

name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
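
As a sanity check on the network above, the blob shapes can be traced by hand. The following is a minimal sketch, not part of the original scripts: the helper names `conv_out`, `pool_out`, and `lenet_shapes` are made up for illustration, and it assumes no padding and Caffe's round-up rule for pooling.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer (rounded up)."""
    return -(-(size - kernel) // stride) + 1


def lenet_shapes(side=28):
    """Return (blob name, shape) pairs for the deploy network above."""
    shapes = [("data", (1, 1, side, side))]
    s = conv_out(side, kernel=5)            # conv1: 20 filters, 5x5
    shapes.append(("conv1", (1, 20, s, s)))
    s = pool_out(s, kernel=2, stride=2)     # pool1: 2x2 max pooling
    shapes.append(("pool1", (1, 20, s, s)))
    s = conv_out(s, kernel=5)               # conv2: 50 filters, 5x5
    shapes.append(("conv2", (1, 50, s, s)))
    s = pool_out(s, kernel=2, stride=2)     # pool2: 2x2 max pooling
    shapes.append(("pool2", (1, 50, s, s)))
    shapes.append(("ip1", (1, 500)))        # fully connected, 500 outputs
    shapes.append(("ip2", (1, 10)))         # fully connected, 10 classes
    shapes.append(("prob", (1, 10)))        # softmax keeps the shape
    return shapes


SHAPES = lenet_shapes()
for name, shape in SHAPES:
    print(name, shape)
```

The spatial size goes 28, 24, 12, 8, 4, so ip1 sees 50 * 4 * 4 = 800 inputs per image.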

Pay particular attention to the input layer:

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}

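Caffe stores data as a four-dimensional array, [num, channel, height, width], which is why the data layer is declared this way. The other difference from the training network is the final layer, prob. Prediction only needs a forward pass that outputs probabilities, so the loss and accuracy layers are not needed.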
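
To make the [num, channel, height, width] layout concrete, here is a small NumPy-only sketch. The arrays are stand-ins for real images (hypothetical data), and no Caffe call is involved.

```python
import numpy as np

# Stand-in for a 28x28 grayscale digit with values in 0..255 (hypothetical data).
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28) % 256

# Grayscale: add the batch axis (N) and the channel axis (C) in front,
# turning (H, W) into (N, C, H, W).
blob = image[np.newaxis, np.newaxis, :, :]
print(blob.shape)

# Color: OpenCV-style images are (H, W, C), so move channels first,
# then add the batch axis.
color = np.zeros((28, 28, 3), dtype=np.float32)
color_blob = color.transpose(2, 0, 1)[np.newaxis, ...]
print(color_blob.shape)
```

A single grayscale MNIST digit therefore fits the `dim: 1 dim: 1 dim: 28 dim: 28` input declared above.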
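Next is image preprocessing: subtracting the mean. Centering the input this way usually helps training converge faster, and prediction must apply the same preprocessing that was used in training.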
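
Mean subtraction itself is a one-line array operation. The sketch below uses random stand-in data and a hypothetical helper `subtract_mean`; it assumes a per-pixel mean image with the same (C, H, W) shape as one input image, and it does not read a real mean.binaryproto.

```python
import numpy as np


def subtract_mean(image_chw, mean_chw):
    """Subtract a per-pixel mean image from one (C, H, W) image."""
    if image_chw.shape != mean_chw.shape:
        raise ValueError("image and mean must have the same shape")
    return image_chw.astype(np.float32) - mean_chw


# Random stand-in for a training set of 100 grayscale 28x28 images.
rng = np.random.RandomState(0)
train = rng.randint(0, 256, size=(100, 1, 28, 28)).astype(np.float32)

# The mean image is the average over the batch axis.
mean_image = train.mean(axis=0)

# The same mean would be subtracted from every image at prediction time.
centered = subtract_mean(train[0], mean_image)
print(centered.shape)
```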
Finally, the trained model: here I use lenet_iter_10000.caffemodel.

II. Predicting with the trained model

Here are the two scripts in full.

Script one:

#coding=utf-8
# Test an MNIST model with Caffe and OpenCV
# test by yuzefan
import os
import sys
import numpy as np
import cv2

caffe_root='/home/ubuntu/caffe-master/'
sys.path.insert(0,caffe_root+'python') # must run before "import caffe"
import caffe

os.chdir(caffe_root)
MODEL_FILE=caffe_root+'mytest/my-mnist/classificat_net.prototxt'
WEIGTHS=caffe_root+'mytest/my-mnist/lenet_iter_10000.caffemodel'
caffe.set_mode_gpu()
net=caffe.Classifier(MODEL_FILE,WEIGTHS)
IMAGE_PATH=caffe_root+'mytest/my-mnist/'
font = cv2.FONT_HERSHEY_SIMPLEX # normal size sans-serif font
for i in range(10): # 0.png through 9.png
  gray=cv2.imread(IMAGE_PATH+'{}.png'.format(i),cv2.IMREAD_GRAYSCALE)
  # Enlarge a uint8 copy for display only; imshow renders float values above 1.0 as white.
  resized=cv2.resize(gray,(280,280),interpolation=cv2.INTER_AREA)
  # astype() is a numpy method that converts the dtype.
  # After adding the channel axis, input_image.shape is (28, 28, 1), with dtype float32.
  input_image = gray.astype(np.float32)[:, :, np.newaxis]
  prediction = net.predict([input_image], oversample=False)
  digit = prediction[0].argmax()
  cv2.putText(resized, str(digit), (200, 280), font, 4, (255,), 2)
  cv2.imshow("Prediction", resized)
  print('predicted class: {}'.format(digit))
  keycode = cv2.waitKey(0) & 0xFF
  if keycode == 27: # Esc quits
    break

Script two:

#coding=utf-8
# Test an MNIST model with Caffe and OpenCV
# test by yuzefan
import os
import sys
import numpy as np
import cv2

caffe_root='/home/ubuntu/caffe-master/'
sys.path.insert(0,caffe_root+'python') # must run before "import caffe"
import caffe

os.chdir(caffe_root)
MODEL_FILE=caffe_root+'mytest/my-mnist/classificat_net.prototxt'
WEIGTHS=caffe_root+'mytest/my-mnist/lenet_iter_10000.caffemodel'
MEAN_FILE=caffe_root+'mytest/my-mnist/mean.binaryproto'
print('Params loaded!')
caffe.set_mode_gpu()
net=caffe.Net(MODEL_FILE,WEIGTHS,caffe.TEST)
# Load the mean file. Note that the mean is not applied to the transformer below.
mean_blob=caffe.proto.caffe_pb2.BlobProto()
with open(MEAN_FILE, 'rb') as f:
  mean_blob.ParseFromString(f.read())
mean_npy = caffe.io.blobproto_to_array(mean_blob)
print(net.blobs['data'].data.shape)

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
# The next three calls are only needed for images read with caffe.io.load_image:
#transformer.set_transpose('data', (2, 0, 1))
#transformer.set_raw_scale('data', 255)
#transformer.set_channel_swap('data', (2, 1, 0))

# Read the class names once; the second space-separated field of each line is the name.
names = []
with open(caffe_root+'mytest/my-mnist/words.txt', 'r') as f:
  for l in f:
    names.append(l.split(' ')[1].strip())
print(names)

for i in range(0,10):
  IMAGE_PATH=caffe_root+'mytest/my-mnist/{}.png'.format(i)

  #im = caffe.io.load_image(IMAGE_PATH)
  gray=cv2.imread(IMAGE_PATH,cv2.IMREAD_GRAYSCALE)
  # Enlarge a uint8 copy for display only; imshow renders float values above 1.0 as white.
  resized=cv2.resize(gray,(280,280),interpolation=cv2.INTER_AREA)
  input_image=gray.astype(np.float32)
  net.blobs['data'].data[...] = transformer.preprocess('data', input_image)
  predict = net.forward()
  prob = net.blobs['prob'].data[0].flatten()
  print('prob: {}'.format(prob))
  print('class: {}'.format(names[np.argmax(prob)]))
  cv2.imshow("Prediction", resized)
  keycode = cv2.waitKey(0) & 0xFF
  if keycode == 27: # Esc quits
    break

The two listings are referred to below as script one and script two.
The key part of script one is:

net=caffe.Classifier(MODEL_FILE,WEIGTHS)
prediction = net.predict([input_image], oversample=False)

As you can see, it reads a grayscale image with OpenCV and passes it to the net for prediction.
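
The vector that comes back is the output of the Softmax layer prob, and argmax picks the most likely digit. A minimal NumPy sketch of that last step follows; the `scores` values are hypothetical ip2 outputs, not results from the real model.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    shifted = scores - np.max(scores)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical ip2 outputs for one image, one score per digit 0..9.
scores = np.array([0.1, 0.2, 5.0, 0.3, 0.0, -1.0, 0.5, 0.2, 0.1, 0.0])

prob = softmax(scores)
predicted = int(np.argmax(prob))
print('predicted class: {}'.format(predicted))
```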
The key part of script two is:

net=caffe.Net(MODEL_FILE,WEIGTHS,caffe.TEST)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
#transformer.set_transpose('data', (2, 0, 1))
##transformer.set_raw_scale('data', 255)
#transformer.set_channel_swap('data', (2, 1, 0))
#img = caffe.io.load_image(IMAGE_PATH)
input_image=cv2.imread(IMAGE_PATH,cv2.IMREAD_GRAYSCALE).astype(np.float32)
net.blobs['data'].data[...] = transformer.preprocess('data', input_image)
predict = net.forward()

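As you can see, I commented out a lot of code, but those lines matter, so I did not delete them. The transformer applies transformations to the data blob. However, when I read the image with caffe.io.load_image(IMAGE_PATH), I always got a [28,28,3] array, that is, three channels every time, which produced the following error: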

could not broadcast input array from shape (28,28,3) into shape (1,1,28,28)

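With no other option I went back to OpenCV, which with cv2.IMREAD_GRAYSCALE returns a two-dimensional (H, W) array. I was hoping an expert could explain this.
For color images, though, caffe.io.load_image(IMAGE_PATH) should be the right choice.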
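
The error can be reproduced without Caffe. The sketch below uses NumPy stand-ins: `blob` mimics the shape of net.blobs['data'].data and `rgb` mimics a three-channel image, and it assumes the three channels of a grayscale file are identical copies. Separately, to the best of my knowledge caffe.io.load_image also takes a `color` argument, and `color=False` returns a single-channel (H, W, 1) image; treat that as something to verify against your Caffe version.

```python
import numpy as np

# Stand-ins: the input blob of the network and a 3-channel 28x28 image.
blob = np.zeros((1, 1, 28, 28), dtype=np.float32)
rgb = np.ones((28, 28, 3), dtype=np.float32)

# Assigning (28, 28, 3) into (1, 1, 28, 28) fails because the trailing
# axes (3 vs 28) cannot be broadcast against each other.
try:
    blob[...] = rgb
    failed = False
except ValueError as err:
    failed = True
    print(err)

# Keeping a single channel gives a (28, 28) array, which does broadcast.
gray = rgb[:, :, 0]
blob[...] = gray
print(blob.shape)
```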
Below is the result of running it.

[Screenshot from 2017-08-22 15:50:07: image missing, the upload failed in the original post]

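The question has since been resolved:
cv2.imread() with cv2.IMREAD_GRAYSCALE returns the image already in grayscale with values in 0~255, so neither rescaling to [0,255] nor the channel swap (2,1,0) is needed. The two calls

transformer.set_raw_scale('data',255) and transformer.set_channel_swap('data',(2,1,0))

are for images read with caffe.io.load_image(), which returns RGB with float values in 0~1. In that case, before feature extraction, call transformer.set_raw_scale('data',255) on the transformer (rescale to 0~255)
and transformer.set_channel_swap('data',(2,1,0)) (convert RGB to BGR).
Done!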
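
The effect of those transformer settings can be imitated in plain NumPy. This is a sketch of the three steps (transpose, channel swap, raw scale) applied to an (H, W, 3) RGB image in 0~1, not Caffe's own implementation; `preprocess_like_transformer` is a hypothetical helper name.

```python
import numpy as np


def preprocess_like_transformer(img_hwc, raw_scale=255.0, channel_swap=(2, 1, 0)):
    """Imitate transpose -> channel swap -> raw scale on an (H, W, 3) image."""
    chw = img_hwc.astype(np.float32).transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)
    chw = chw[list(channel_swap), :, :]                  # RGB -> BGR
    return chw * raw_scale                               # 0~1 -> 0~255


# A pure red image in RGB order with values in 0~1 (stand-in data).
img = np.zeros((28, 28, 3), dtype=np.float32)
img[..., 0] = 1.0

out = preprocess_like_transformer(img)
print(out.shape)
```

After the swap, red sits in the last channel, which is where BGR order expects it.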
