Recognizing MNIST Handwritten Digits with OpenCV's Built-in Neural Network

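    After working through the neural network chapters of Stanford's machine learning open course, I had long wanted to implement a neural network myself. Without a reference to compare against, though, I had no real way to verify that my own program was correct. So I first needed an existing, working neural network to run comparison experiments against. OpenCV happens to include one, so as a first step I used OpenCV's built-in neural network to recognize the MNIST handwritten digits.
    In OpenCV, a neural network needs a few parameters to be set before it can be used.
    OpenCV's neural network is defined by the class CvANN_MLP. A network can be created with either the CvANN_MLP constructor or its create function, and creation takes the following parameters: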
[Figure 1: screenshot of the parameter list for the CvANN_MLP constructor / create function]

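    So the first things to set are the number of neurons in each layer (including the input and output layers), the activation function, and the values of the activation function's parameters.
    Here I chose the sigmoid function and set both of its parameters to 1.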
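
    For reference, the OpenCV documentation gives the symmetric sigmoid (CvANN_MLP::SIGMOID_SYM) as f(x) = β(1 − e^(−αx)) / (1 + e^(−αx)). Below is a minimal standalone sketch of that formula. The name SymmetricSigmoid is my own and is not part of OpenCV; it is only meant to show what the two parameters do.

```cpp
#include <cmath>

// Hypothetical helper (not an OpenCV function): the symmetric sigmoid
//     f(x) = beta * (1 - exp(-alpha * x)) / (1 + exp(-alpha * x))
// alpha controls the steepness, beta scales the output range to (-beta, beta).
// This is the naive form: exp() can overflow when alpha * x is very negative.
double SymmetricSigmoid(double x, double alpha, double beta)
{
    const double e = std::exp(-alpha * x);
    return beta * (1.0 - e) / (1.0 + e);
}
```

    With alpha = beta = 1, as used in this post, the output lies in (−1, 1) and the function is the same as tanh(x / 2).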

    Next, let's look at the network's training function:
[Figure 2: screenshot of the parameter list for the CvANN_MLP training function]
    The parameter list shows that training requires a parameter of type CvANN_MLP_TrainParams.
    CvANN_MLP_TrainParams is described as follows:
[Figure 3: screenshot of the CvANN_MLP_TrainParams description]
    I chose the backpropagation (BP) algorithm, because I have not studied the other one.
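
    The two backpropagation settings used later, bp_dw_scale and bp_moment_scale, are described in the OpenCV documentation as the strength of the weight gradient term and the strength of the momentum term. The sketch below shows the textbook update those descriptions suggest. It is an illustration under that assumption, not OpenCV's internal code, and the names UpdateWeights, dwScale and momentScale are my own.

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: one gradient-descent step with momentum.
//     delta_t = -dwScale * gradient + momentScale * delta_{t-1}
//     weight += delta_t
// All three vectors are assumed to have the same length.
void UpdateWeights(std::vector<double>& weights,
                   const std::vector<double>& gradient,
                   std::vector<double>& previousDelta,
                   double dwScale, double momentScale)
{
    for (std::size_t i = 0; i < weights.size(); ++i)
    {
        const double delta = -dwScale * gradient[i]
                             + momentScale * previousDelta[i];
        weights[i] += delta;
        previousDelta[i] = delta;   // remembered for the next step
    }
}
```

    Setting momentScale to 0, as the code in this post does with bp_moment_scale, reduces this to plain gradient descent with step size dwScale.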

    With the required parameters understood, we can start using OpenCV's neural network. The code is as follows:
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

#include "NeuralNetworksFunctions.h"

/**
 * @brief NeuralNetworksTraing Creates and trains the neural network
 * @param NeuralNetworks   The neural network
 * @param InputMat         Floating-point matrix of input vectors, one vector
 *                         per row.
 * @param OutputMat        Floating-point matrix of the corresponding output
 *                         vectors, one vector per row.
 * @param MaxIte           The maximum number of training iterations
 *
 *
 * @author sheng
 * @version 1.0.0
 * @date  2014-04-17
 *
 * @history     <author>      <date>      <version>      <description>
 *               sheng      2014-04-08      1.0.0      build the function
 *
 */

void NeuralNetworksTraing(CvANN_MLP& NeuralNetworks, const cv::Mat& InputMat,
                          const cv::Mat& OutputMat, int MaxIte)
{
    // Network architecture
    std::vector<int> LayerSizes;
    LayerSizes.push_back(InputMat.cols);    // input layer
    LayerSizes.push_back(1000);             // hidden layer has 1000 neurons
    LayerSizes.push_back(OutputMat.cols);   // output layer


    // Activation function
    int ActivateFunc = CvANN_MLP::SIGMOID_SYM;
    double Alpha = 1;
    double Beta = 1;


    // create the network
    NeuralNetworks.create(cv::Mat(LayerSizes), ActivateFunc, Alpha, Beta);



    // Training Params
    CvANN_MLP_TrainParams TrainParams;
    TrainParams.train_method = CvANN_MLP_TrainParams::BACKPROP;
    TrainParams.bp_dw_scale = 0.0001;
    TrainParams.bp_moment_scale = 0;

    // termination criteria: stop at MaxIte iterations or when epsilon is reached
    CvTermCriteria TermCrlt;
    TermCrlt.type = CV_TERMCRIT_ITER | CV_TERMCRIT_EPS;
    TermCrlt.epsilon = 0.0001f;
    TermCrlt.max_iter = MaxIte;
    TrainParams.term_crit = TermCrlt;


    // Train the network
    NeuralNetworks.train(InputMat, OutputMat, cv::Mat(), cv::Mat(), TrainParams);

}




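/**
 * @brief NeuralNetworksPredict Predicts with a trained network
 * @param NeuralNetworks   The trained neural network
 * @param Input            Floating-point matrix of input vectors, one vector
 *                         per row.
 * @param Output           The decoded predictions, one per input row
 */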
void NeuralNetworksPredict(const CvANN_MLP& NeuralNetworks, const cv::Mat& Input,
                           cv::Mat& Output)
{
    // Neural network predict
    cv::Mat OutputVector;
    NeuralNetworks.predict(Input, OutputVector);

    // decode the network's output vectors into the final predictions
    DecodeOutputVector(OutputVector, Output, OutputVector.cols);


}
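
    The helpers EncodeOutputVector and DecodeOutputVector come from my own header and are not listed in this post. As a rough idea of what such helpers do, here is a minimal sketch on plain std::vector rather than cv::Mat. The names EncodeLabel and DecodeLabel, and the choice of 1 and 0 as target values, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical one-hot encoder: label 3 with 10 classes becomes
// {0, 0, 0, 1, 0, 0, 0, 0, 0, 0}. Assumes 0 <= label < numClasses.
std::vector<float> EncodeLabel(int label, int numClasses)
{
    std::vector<float> encoded(static_cast<std::size_t>(numClasses), 0.0f);
    encoded[static_cast<std::size_t>(label)] = 1.0f;
    return encoded;
}

// Hypothetical decoder: the predicted class is the index of the largest
// network output. Assumes scores is not empty.
int DecodeLabel(const std::vector<float>& scores)
{
    return static_cast<int>(std::distance(
        scores.begin(), std::max_element(scores.begin(), scores.end())));
}
```

    Encoding a label and decoding it again returns the original label, and decoding a raw network output simply picks the neuron with the strongest response.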


Here is the test code:
#include <iostream>
#include <string>

#include "NeuralNetworksFunctions.h"
#include "MNIST.h"
#include "timer.h"



void Test_NeuralNetwork()
{


    // prepare the training data
    std::string TrainingImageFileName =
            "train-images.idx3-ubyte";
    cv::Mat TrainingImages = ReadImages(TrainingImageFileName);
    cv::Mat FloatTrainingImages;
    ConvertToFloatMat(TrainingImages, FloatTrainingImages);

    // normalize the training samples: invert and scale by 1/255
    FloatTrainingImages = (255 - FloatTrainingImages) / 255;


    // prepare the training label
    std::string TrainingLabelFileName =
            "train-labels.idx1-ubyte";
    cv::Mat TrainingLabels = ReadLabels(TrainingLabelFileName);
    cv::Mat TrainingLabelVector;
    EncodeOutputVector(TrainingLabels, TrainingLabelVector, 10);


    // defining the network
    CvANN_MLP Networks;


    // The maximum number of iterations
    int MaxIte = 2;


    // training
    Utility::Timer NetworksTimer;
    std::cout << "Training is started." << std::endl;
    NetworksTimer.Start();

    NeuralNetworksTraing(Networks, FloatTrainingImages, TrainingLabelVector,
                         MaxIte);

    NetworksTimer.Finish();
    std::cout << "Training has finished." << std::endl;
    std::cout << "The training time is " << NetworksTimer.GetDuration()
              << std::endl;


    // save the networks
    Networks.save("NerualNetworks-ite=2-1000hidden.xml");



    // prepare the testing data
    std::string TestingImageFileName =
           "t10k-images.idx3-ubyte";
    cv::Mat TestingImages = ReadImages(TestingImageFileName);
    cv::Mat FloatTestingImages;
    ConvertToFloatMat(TestingImages, FloatTestingImages);

    // normalize the testing samples: invert and scale by 1/255
    FloatTestingImages = (255 - FloatTestingImages) / 255;


    // predicting
    cv::Mat NetworkOutput;
    NeuralNetworksPredict(Networks, FloatTestingImages, NetworkOutput);


    // the actual labels of the testing samples
    std::string TestingLabelFileName =
           "t10k-labels.idx1-ubyte";
    cv::Mat TestingLabels = ReadLabels(TestingLabelFileName);



    // count the correctly classified testing samples
    int NumberOfCorrect = 0;

    for (cv::MatIterator_<uchar> NetworkOutputIte = NetworkOutput.begin<uchar>(),
         ActuralOutputIte = TestingLabels.begin<uchar>();
         NetworkOutputIte != NetworkOutput.end<uchar>();
         NetworkOutputIte++, ActuralOutputIte++)
    {
        if ((*NetworkOutputIte) == (*ActuralOutputIte))
        {
            NumberOfCorrect++;
        }
    }



    float Rate = 0;
    if (NetworkOutput.rows == 0)
    {
        std::cout << "There are no testing samples." << std::endl;
    }
    else
    {
        Rate = (float)(NumberOfCorrect) / NetworkOutput.rows;
    }

    std::cout << "The number of correct predictions is " << NumberOfCorrect << std::endl;
    std::cout << "The number of testing samples is " << NetworkOutput.rows << std::endl;
    std::cout << "The accuracy of the network is " << Rate << std::endl;


}

    With the BP algorithm, 2 iterations, and 1000 neurons in the hidden layer, I got the following result:
[Figure 4: screenshot of the program output for this run]


    The result is decent: the accuracy is about 90%.












