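Official documentation for the DNN module: https://docs.opencv.org/3.4.3/d6/d0f/group__dnn.html#ga29d0ea5e52b1d1a6c2681e3f7d68473a
1. Introduction to the DNN module in OpenCV 3.4.3
The DNN module was moved from the extra (contrib) modules into the main OpenCV release in version 3.3. The module originated from tiny-dnn and could load pre-trained Caffe models. OpenCV has since extended it to load models trained and exported by the mainstream deep learning frameworks. Commonly used ones include: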
Caffe
TensorFlow
Torch/PyTorch
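Take the GoogLeNet Caffe model as an example.
Three files are needed: 1) the model weights, 2) the model configuration file, i.e. the network architecture, and 3) the ImageNet label file: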
bvlc_googlenet.caffemodel
bvlc_googlenet.prototxt
synset_words.txt
1) Download the GoogLeNet caffemodel file: http://dl.caffe.berkeleyvision.org/
2) Download bvlc_googlenet.prototxt: https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/bvlc_googlenet.prototxt
3) Download the ImageNet label file: http://ddl.escience.cn/f/QPnb
// Image classification with the opencv_dnn module and a pre-trained GoogLeNet network
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace cv;
using namespace cv::dnn;
using namespace std;
String modelTxt = "bvlc_googlenet.prototxt";
String modelBin = "bvlc_googlenet.caffemodel";
String labelFile = "synset_words.txt";
vector<String> readClasslabels();
int main(int argc, char** argv) {
    Mat testImage = imread("monky.jpg");
    if (testImage.empty()) {
        printf("could not load image...\n");
        return -1;
    }
    // Create the GoogLeNet network from the prototxt (text) and caffemodel (binary) files
    Net net = dnn::readNetFromCaffe(modelTxt, modelBin);
    if (net.empty())
    {
        std::cerr << "Can't load network by using the following files: " << std::endl;
        std::cerr << "prototxt: " << modelTxt << std::endl;
        std::cerr << "caffemodel: " << modelBin << std::endl;
        return -1;
    }
    // Read the class labels
    vector<String> labels = readClasslabels();
    // GoogLeNet expects a 224x224 BGR input. swapRB is passed as false explicitly so that
    // the BGR mean Scalar(104, 117, 123) matches the BGR channel order produced by imread.
    Mat inputBlob = blobFromImage(testImage, 1, Size(224, 224), Scalar(104, 117, 123), false);
    // The network classifies images into 1000 classes
    Mat prob;
    // Repeat the forward pass 10 times
    for (int i = 0; i < 10; i++)
    {
        // Set the network input
        net.setInput(inputBlob, "data");
        // Run the classification
        prob = net.forward("prob");
    }
    // Find the class index with the highest probability
    Mat probMat = prob.reshape(1, 1); // reshape the blob to a 1x1000 matrix, one column per class
    Point classNumber;
    double classProb;
    minMaxLoc(probMat, NULL, &classProb, NULL, &classNumber); // location and value of the maximum
    int classIdx = classNumber.x; // class index
    printf("\n current image classification : %s, probability : %.2f \n", labels.at(classIdx).c_str(), classProb);
    putText(testImage, labels.at(classIdx), Point(20, 20), FONT_HERSHEY_SIMPLEX, 0.75, Scalar(0, 0, 255), 2, 8);
    imshow("Image Category", testImage);
    waitKey(0);
    return 0;
}
/* Read the text labels of the 1000 image classes */
vector<String> readClasslabels() {
    std::vector<String> classNames;
    std::ifstream fp(labelFile.c_str());
    if (!fp.is_open())
    {
        std::cerr << "File with classes labels not found: " << labelFile << std::endl;
        exit(-1);
    }
    std::string name;
    // Loop on getline itself so that a failed read at end of file is not processed
    while (std::getline(fp, name))
    {
        // Keep only the text after the first space (the part after the class id)
        if (name.length())
            classNames.push_back(name.substr(name.find(' ') + 1));
    }
    fp.close();
    return classNames;
}
Result:
Code: https://github.com/liuzheCSDN/OpenCV/tree/master/DNN_bvlc_googlenet
The https://github.com/BVLC/caffe repository has a wealth of material and is worth exploring when time allows.
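The label-reading step in the program above keeps only the text after the first space on each line of synset_words.txt. A minimal standalone sketch of that parsing rule follows. The helper name labelFromLine is hypothetical and not part of OpenCV, and the sample line in the comment is only an illustration of the assumed "<id> <label>" layout.

```cpp
#include <string>

// Hypothetical helper (not an OpenCV function): strips the leading class id
// from one line of a synset_words.txt-style file, assumed to look like
// "n01440764 tench, Tinca tinca".
std::string labelFromLine(const std::string& line) {
    std::string::size_type space = line.find(' ');
    if (space == std::string::npos) {
        return line;  // no id prefix found: keep the whole line
    }
    return line.substr(space + 1);
}
```

A line without any space is returned unchanged, which matches what name.substr(name.find(' ') + 1) does in the program, because npos + 1 wraps around to 0.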
1. The readNetFromCaffe() function: loads a model trained with Caffe.
String modelConfiguration = "D:\\Program Files\\OpenCV\\opencv\\sources\\samples\\dnn\\face_detector\\deploy.prototxt";
String modelBinary = "D:\\Program Files\\OpenCV\\opencv\\sources\\samples\\dnn\\face_detector\\res10_300x300_ssd_iter_140000_fp16.caffemodel";
//! [Initialize network]
dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);//Reads a network model stored in Caffe format from the two files
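2. The blobFromImage function
Four-dimensional matrices:
Read two images separately; in MATLAB each color image is stored as a 3-dimensional matrix (height*width*channel)
e.g.: img1=imread('1.jpg');img2=imread('2.jpg');
Store two images in a single 4-dimensional matrix (the fourth dimension has size 2):
I=imread('1.jpg');J=imread('3.jpg'); % the two images must be the same size
K(:,:,:,1)=I;K(:,:,:,2)=J;
>> size(K)
ans = 256 256 3 2
Note: size(A) returns the size of matrix A along each dimension, and length(size(A)) returns the number of dimensions of A.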
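The MATLAB example stacks images along a fourth dimension. The blob that blobFromImage returns is also 4-dimensional, but its dimensions are ordered [N, C, H, W]. The following sketch shows how an element of such a blob could be located in contiguous memory. The helper nchwOffset is hypothetical and assumes a densely packed buffer; it is not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCV function): flat offset of element
// (n, c, h, w) in a densely packed blob stored in N, C, H, W order,
// where C, H and W are the channel count, height and width.
std::size_t nchwOffset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                       std::size_t C, std::size_t H, std::size_t W) {
    assert(c < C && h < H && w < W);
    // The width index varies fastest, then height, then channel, then image.
    return ((n * C + c) * H + h) * W + w;
}
```

With this layout, all pixels of one channel are stored together, and each image occupies C*H*W consecutive elements.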
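Creates a 4-dimensional blob from an image. Optionally resizes and center-crops the image, subtracts the mean values, scales the values by scalefactor, and swaps the blue and red channels.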
Mat blobFromImage(InputArray image, double scalefactor=1.0, const Size& size = Size(),
const Scalar& mean = Scalar(), bool swapRB=true, bool crop=true,
int ddepth=CV_32F);
image: input image (with 1, 3 or 4 channels).
size: spatial size of the output image.
mean: scalar with the mean values that are subtracted from the channels. If the image has BGR ordering and swapRB is true, the values should be given in (mean-R, mean-G, mean-B) order.
scalefactor: multiplier applied to the image values.
swapRB: flag indicating whether the first and last channels of a 3-channel image should be swapped.
crop: flag indicating whether the image is cropped after resizing.
ddepth: depth of the output blob; choose CV_32F or CV_8U.
If crop is true, the input image is resized so that one side after the resize equals the corresponding dimension in size and the other side is equal or larger. The image is then cropped from the center. If crop is false, the image is resized directly to size, without cropping and without preserving the aspect ratio.
The function returns a 4-dimensional Mat with dimensions in [N, C, H, W] order (images, channels, height, width).
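To make the parameter descriptions concrete, here is a plain C++ sketch of the per-pixel arithmetic only: mean subtraction, scaling, the optional channel swap, and the change from interleaved pixels to planar channels. It is not the OpenCV implementation. The helper makeBlob is hypothetical, it handles a single 8-bit 3-channel image, and it leaves out resizing and cropping entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): converts one interleaved
// 8-bit BGR image (height x width x 3) into a 1 x 3 x height x width float blob.
// Resizing and cropping are intentionally left out of this sketch.
std::vector<float> makeBlob(const std::vector<unsigned char>& bgr,
                            std::size_t height, std::size_t width,
                            float scalefactor, const float mean[3], bool swapRB) {
    assert(bgr.size() == height * width * 3);
    std::vector<float> blob(3 * height * width);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < 3; ++c) {
                // With swapRB, the first and last source channels trade places,
                // so mean[] is interpreted in the output channel order.
                std::size_t src = swapRB ? 2 - c : c;
                float v = bgr[(y * width + x) * 3 + src];
                // Subtract the mean first, then scale.
                blob[(c * height + y) * width + x] = scalefactor * (v - mean[c]);
            }
        }
    }
    return blob;
}
```

For a single BGR pixel (110, 120, 130) with mean (104, 117, 123), scalefactor 1 and swapRB false, the three output channels hold 6, 3 and 7. With swapRB true, the same mean would be applied to the channels in R, G, B order, which is why the mean must then be given as (mean-R, mean-G, mean-B).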
3. The Net class in DNN
Documentation: https://docs.opencv.org/3.4.3/db/d30/classcv_1_1dnn_1_1Net.html
3.1 The setInput function
Sets the new input value for the network.
void setInput(InputArray blob, const String& name = "",
double scalefactor = 1.0, const Scalar& mean = Scalar());
/** @brief Sets the new input value for the network
* @param blob A new blob. Should have CV_32F or CV_8U depth.
* @param name A name of input layer.
* @param scalefactor An optional normalization scale.
* @param mean An optional mean subtraction values.
* @see connect(String, String) to know format of the descriptor.
*
* If scale or mean values are specified, a final input blob is computed
* as:
* \f[input(n,c,h,w) = scalefactor \times (blob(n,c,h,w) - mean_c)\f]
*/
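blob: the new blob; it should have CV_32F or CV_8U depth.
name: name of the input layer.
scalefactor: optional normalization scale.
mean: optional mean values to subtract.
If scalefactor or mean is specified, the final input blob is computed as:
input(n,c,h,w) = scalefactor × (blob(n,c,h,w) − mean_c)
Usage: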
net.setInput(inputBlob, "data");
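The normalization formula from the setInput documentation can be sketched in plain C++ as follows. This is only an illustration of the arithmetic on a densely packed N x C x H x W float buffer, not what OpenCV runs internally. The helper normalizeBlob and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): applies
// input(n,c,h,w) = scalefactor * (blob(n,c,h,w) - mean_c)
// in place to a densely packed N x C x H x W float blob.
void normalizeBlob(std::vector<float>& blob, std::size_t N, std::size_t C,
                   std::size_t H, std::size_t W,
                   float scalefactor, const std::vector<float>& mean) {
    assert(blob.size() == N * C * H * W);
    assert(mean.size() == C);
    const std::size_t plane = H * W;  // number of elements in one channel
    for (std::size_t i = 0; i < blob.size(); ++i) {
        std::size_t c = (i / plane) % C;  // channel index of flat element i
        blob[i] = scalefactor * (blob[i] - mean[c]);
    }
}
```

In the GoogLeNet program the mean was already subtracted by blobFromImage, so setInput is called there with only the blob and the layer name, leaving scalefactor and mean at their defaults.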
3.2 The forward function
Runs a forward pass to compute the output of the layer named outputName.
/** @brief Runs forward pass to compute output of layer with name @p outputName.
* @param outputName name for layer which output is needed to get
* @return blob for first output of specified layer.
* @details By default runs forward pass for the whole network.
*/
CV_WRAP Mat forward(const String& outputName = String());
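Parameters
outputName: name of the layer whose output is needed.
Returns: the blob for the first output of the specified layer. By default, the forward pass runs for the whole network.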
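The GoogLeNet program reads only the single best class from the "prob" output with minMaxLoc. If several candidates are wanted, the scores could first be copied into a std::vector<float> and then ranked. The sketch below assumes that copy has already been made. The helper topK is hypothetical and uses only the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not an OpenCV function): returns the indices of the
// k largest scores, ordered from highest to lowest score.
std::vector<std::size_t> topK(const std::vector<float>& scores, std::size_t k) {
    std::vector<std::size_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), 0);  // fill with 0, 1, 2, ...
    k = std::min(k, idx.size());           // never ask for more than we have
    // Sort only the first k positions, comparing indices by their score.
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];
                      });
    idx.resize(k);
    return idx;
}
```

Each returned index can be used to look up the label in the same way classIdx is used in the program.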