在之前的一篇博客(https://blog.csdn.net/zhebushibiaoshifu/article/details/114806478)中,我们对基于MATLAB的随机森林(RF)回归与变量影响程度(重要性)排序代码加以详细讲解与实践。本次我们继续基于MATLAB,对另一种常用的机器学习方法——神经网络方法加以代码实战。
首先需要注明的是,在MATLAB中,我们可以直接基于“APP”中的“Neural Net Fitting”工具箱实现在无需代码的情况下,对神经网络算法加以运行:
基于工具箱的神经网络方法虽然方便,但是一些参数不能调整;同时也不利于我们对算法、代码的理解。因此,本文不利用“Neural Net Fitting”工具箱,而是直接通过代码将神经网络方法加以运行——但是,本文的代码其实也是通过上述工具箱运行后生成的;而这种生成神经网络代码的方法也是MATLAB官方推荐的方式。
另外,需要注意的是,本文直接进行神经网络算法的执行,省略了前期数据处理、训练集与测试集划分、精度衡量指标选取等。因此建议大家先将这一篇博客(https://blog.csdn.net/zhebushibiaoshifu/article/details/114806478)阅读后,再阅读本文。
本文分为两部分,首先是将代码分段、详细讲解,方便大家理解;随后是完整代码,方便大家自行尝试。
1 分解代码
1.1 循环准备
由于机器学习往往需要多次执行,我们就在此先定义循环。
%% ANN Cycle Preparation
ANNRMSE=9999;
ANNRunNum=0;
ANNRMSEMatrix=[];
ANNrAllMatrix=[];
while ANNRMSE>400
其中,ANNRMSE
是初始的RMSE;ANNRunNum
是神经网络算法当前运行的次数;ANNRMSEMatrix
用来存储每一次神经网络运行后所得到的RMSE结果;ANNrAllMatrix
用来存储每一次神经网络运行后所得到的皮尔逊相关系数结果;最后一句表示当所得到的模型RMSE>400
时,则停止循环。
1.2 神经网络构建
接下来,我们对神经网络的整体结构加以定义。
%% ANN
x=TrainVARI';
t=TrainYield';
trainFcn = 'trainlm';
hiddenLayerSize = [10 10 10];
ANNnet = fitnet(hiddenLayerSize,trainFcn);
其中,TrainVARI
、TrainYield
分别是我这里训练数据的自变量(特征)与因变量(标签);trainFcn
为神经网络所选用的训练函数方法名称,其名称与对应的方法对照如下表:
hiddenLayerSize
为神经网络所用隐层与各层神经元个数,[10 10 10]
代表共有三层隐层,各层神经元个数分别为10,10,10。
1.3 数据处理
接下来,对输入神经网络模型的数据加以处理。
ANNnet.input.processFcns = {'removeconstantrows','mapminmax'};
ANNnet.output.processFcns = {'removeconstantrows','mapminmax'};
ANNnet.divideFcn = 'dividerand';
ANNnet.divideMode = 'sample';
ANNnet.divideParam.trainRatio = 0.6;
ANNnet.divideParam.valRatio = 0.4;
ANNnet.divideParam.testRatio = 0.0;
其中,ANNnet.input.processFcns
与ANNnet.output.processFcns
分别代表输入模型数据的处理方法,'removeconstantrows'
表示删除在各样本中数值始终一致的特征列,'mapminmax'
表示将数据归一化处理;divideFcn
表示划分数据训练集、验证集与测试集的方法,'dividerand'
表示依据所给定的比例随机划分;divideMode
表示对数据划分的维度,我们这里选择'sample'
,也就是对样本进行划分;divideParam
表示训练集、验证集与测试集所占比例,那么在这里,因为是直接用了先前随机森林方法(可以看这篇博客)中的数据划分方式,那么为了保证训练集、测试集的固定,我们就将divideParam.testRatio
设置为0.0
,然后将训练集与验证集比例划分为0.6
与0.4
。
1.4 模型训练参数配置
接下来对模型运行过程中的主要参数加以配置。
ANNnet.performFcn = 'mse';
ANNnet.trainParam.epochs=5000;
ANNnet.trainParam.goal=0.01;
其中,performFcn
为模型误差衡量函数,'mse'
表示均方误差;trainParam.epochs
表示训练时Epoch次数,trainParam.goal
表示模型所要达到的精度要求(即模型运行到trainParam.epochs
次时或误差小于trainParam.goal
时将会停止运行。
1.5 神经网络实现
这一部分代码大多数与绘图、代码与GUI生成等相关,因此就不再一一解释了,大家可以直接运行。需要注意的是,train
是模型训练函数。
% For a list of all plot functions type: help nnplot
ANNnet.plotFcns = {'plotperform','plottrainstate','ploterrhist','plotregression','plotfit'};
[ANNnet,tr] = train(ANNnet,x,t);
y = ANNnet(x);
e = gsubtract(t,y);
performance = perform(ANNnet,t,y);
% Recalculate Training, Validation and Test Performance
trainTargets = t .* tr.trainMask{1};
valTargets = t .* tr.valMask{1};
testTargets = t .* tr.testMask{1};
trainPerformance = perform(ANNnet,trainTargets,y);
valPerformance = perform(ANNnet,valTargets,y);
testPerformance = perform(ANNnet,testTargets,y);
% view(net)
% Plots
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotregression(t,y)
%figure, plotfit(net,x,t)
% Deployment
% See the help for each generation function for more information.
if (false)
% Generate MATLAB function for neural network for application
% deployment in MATLAB scripts or with MATLAB Compiler and Builder
% tools, or simply to examine the calculations your trained neural
% network performs.
genFunction(ANNnet,'myNeuralNetworkFunction');
y = myNeuralNetworkFunction(x);
end
if (false)
% Generate a matrix-only MATLAB function for neural network code
% generation with MATLAB Coder tools.
genFunction(ANNnet,'myNeuralNetworkFunction','MatrixOnly','yes');
y = myNeuralNetworkFunction(x);
end
if (false)
% Generate a Simulink diagram for simulation or deployment with.
% Simulink Coder tools.
gensim(ANNnet);
end
1.6 精度衡量
%% Accuracy of ANN
ANNPredictYield=sim(ANNnet,TestVARI')';
ANNRMSE=sqrt(sum(sum((ANNPredictYield-TestYield).^2))/size(TestYield,1));
ANNrMatrix=corrcoef(ANNPredictYield,TestYield);
ANNr=ANNrMatrix(1,2);
ANNRunNum=ANNRunNum+1;
ANNRMSEMatrix=[ANNRMSEMatrix,ANNRMSE];
ANNrAllMatrix=[ANNrAllMatrix,ANNr];
disp(ANNRunNum);
end
disp(ANNRMSE);
其中,ANNPredictYield
为预测结果;ANNRMSE
、ANNrMatrix
分别为模型精度衡量指标RMSE与皮尔逊相关系数。结合本文1.1部分可知,我这里设置为当所得神经网络模型RMSE在400以内时,将会停止循环;否则继续开始执行本文1.2部分至1.6部分的代码。
1.7 保存模型
这一部分就不再赘述了,大家可以参考这篇博客(https://blog.csdn.net/zhebushibiaoshifu/article/details/114806478)。
%% ANN Model Storage
ANNModelSavePath='G:\CropYield\02_CodeAndMap\00_SavedModel\';
save(sprintf('%sRF0417ANN0399.mat',ANNModelSavePath),'TestVARI','TestYield','TrainVARI','TrainYield','ANNnet','ANNPredictYield','ANNr','ANNRMSE',...
'hiddenLayerSize');
2 完整代码
完整代码如下:
%% ANN Cycle Preparation
ANNRMSE=9999;
ANNRunNum=0;
ANNRMSEMatrix=[];
ANNrAllMatrix=[];
while ANNRMSE>1000
%% ANN
x=TrainVARI';
t=TrainYield';
trainFcn = 'trainlm';
hiddenLayerSize = [10 10 10];
ANNnet = fitnet(hiddenLayerSize,trainFcn);
ANNnet.input.processFcns = {'removeconstantrows','mapminmax'};
ANNnet.output.processFcns = {'removeconstantrows','mapminmax'};
ANNnet.divideFcn = 'dividerand';
ANNnet.divideMode = 'sample';
ANNnet.divideParam.trainRatio = 0.6;
ANNnet.divideParam.valRatio = 0.4;
ANNnet.divideParam.testRatio = 0.0;
ANNnet.performFcn = 'mse';
ANNnet.trainParam.epochs=5000;
ANNnet.trainParam.goal=0.01;
% For a list of all plot functions type: help nnplot
ANNnet.plotFcns = {'plotperform','plottrainstate','ploterrhist','plotregression','plotfit'};
[ANNnet,tr] = train(ANNnet,x,t);
y = ANNnet(x);
e = gsubtract(t,y);
performance = perform(ANNnet,t,y);
% Recalculate Training, Validation and Test Performance
trainTargets = t .* tr.trainMask{1};
valTargets = t .* tr.valMask{1};
testTargets = t .* tr.testMask{1};
trainPerformance = perform(ANNnet,trainTargets,y);
valPerformance = perform(ANNnet,valTargets,y);
testPerformance = perform(ANNnet,testTargets,y);
% view(net)
% Plots
%figure, plotperform(tr)
%figure, plottrainstate(tr)
%figure, ploterrhist(e)
%figure, plotregression(t,y)
%figure, plotfit(net,x,t)
% Deployment
% See the help for each generation function for more information.
if (false)
% Generate MATLAB function for neural network for application
% deployment in MATLAB scripts or with MATLAB Compiler and Builder
% tools, or simply to examine the calculations your trained neural
% network performs.
genFunction(ANNnet,'myNeuralNetworkFunction');
y = myNeuralNetworkFunction(x);
end
if (false)
% Generate a matrix-only MATLAB function for neural network code
% generation with MATLAB Coder tools.
genFunction(ANNnet,'myNeuralNetworkFunction','MatrixOnly','yes');
y = myNeuralNetworkFunction(x);
end
if (false)
% Generate a Simulink diagram for simulation or deployment with.
% Simulink Coder tools.
gensim(ANNnet);
end
%% Accuracy of ANN
ANNPredictYield=sim(ANNnet,TestVARI')';
ANNRMSE=sqrt(sum(sum((ANNPredictYield-TestYield).^2))/size(TestYield,1));
ANNrMatrix=corrcoef(ANNPredictYield,TestYield);
ANNr=ANNrMatrix(1,2);
ANNRunNum=ANNRunNum+1;
ANNRMSEMatrix=[ANNRMSEMatrix,ANNRMSE];
ANNrAllMatrix=[ANNrAllMatrix,ANNr];
disp(ANNRunNum);
end
disp(ANNRMSE);
%% ANN Model Storage
ANNModelSavePath='G:\CropYield\02_CodeAndMap\00_SavedModel\';
save(sprintf('%sRF0417ANN0399.mat',ANNModelSavePath),'AreaPercent','InputOutput','nLeaf','nTree',...
'RandomNumber','RFModel','RFPredictConfidenceInterval','RFPredictYield','RFr','RFRMSE',...
'TestVARI','TestYield','TrainVARI','TrainYield','ANNnet','ANNPredictYield','ANNr','ANNRMSE',...
'hiddenLayerSize');