一般情况,学者们采用人工智能,数学建模等方式预测COVID-19的传播.然而有的数学模型,其理论推导过程复杂且专业性强,不易理解和推广;有些数学模型需要提前估计参数,在参数确定环节就会引起一定的失真;在统计数据集时,学者们都更偏爱以当日报道的确诊总人数为子集.文章以每日报道的新增确诊人数为子集,基于大量的实时动态变化数据,不需要流行病学方面的专业知识,分别以BILSTM, GMDH和遗传神经网络三种算法去预测COVID-19未来的传播,模型简单可靠,便于应用.
GMDH 数据处理组合方法是前苏联乌克兰科学院的 A.G.Ivaknenko于1968年运用多层神经网络原理和品种改良假说提出的一种复杂非线性系统的启发式自组织建模方法,以 K-G(Kol-mogorov-Gabor)多项式为基础通过不断筛选组合建立非线性系统的模型,对高阶非线性系统的辨识很有效。GMDH 神 经 网 络 是 前 馈 神 经 网络中一种用于预测的实用神经网络,其特点是网络结构不固定,在训练过程中不断变化。GMDH 神经网络具有以下特点:
(1)建 模 过 程 自 组 织 控 制,不 需 任 何 初 始假设;
(2)最优复杂性及高精度预测;
(3)能够自组织多层神经网络每层的最佳结构,即能够自动保留有用变量和删除多余变量;
(4)能够自动选择最佳网络层数和每层的神经元数目。GMDH 算法由系统各输入单元交叉组合产生一系列的活动神经元,其中每一神经元都具有选择最优传递函数的功能,再从已产生的一代神经元中选择若干与目标变量最为接近的神经元,被选出神经元强强结合再次产生新的神经元,重复这样的优势遗传、竞争生存和进化的过程,直至具有最佳复杂性的模型被选出。
clc;
clear;
close all;
%% Input
x=load('hopkinsirandeath.txt');
Delays = [10 20 30 40 50];
[Inputs, Targets] = CreateTimeSeriesData(x,Delays);
nData = size(Inputs,2);
Perm = randperm(nData);
% Train
pTrain = 0.8;
nTrainData = round(pTrain*nData);
TrainInd = Perm(1:nTrainData);
TrainInputs = Inputs(:,TrainInd);
TrainTargets = Targets(:,TrainInd);
% Test
pTest = 1 - pTrain;
nTestData = nData - nTrainData;
TestInd = Perm(nTrainData+1:end);
TestInputs = Inputs(:,TestInd);
TestTargets = Targets(:,TestInd);
% Create GMDH Network
params.MaxLayerNeurons = 30; % Maximum Number of Neurons in a Layer
params.MaxLayers = 9; % Maximum Number of Layers
params.alpha = 0; % Selection Pressure
params.pTrain = 0.2; % Train Ratio
% Tran GMDH
gmdh = GMDH(params, TrainInputs, TrainTargets);
% GMDH Model on Train and Test (Validation)
Outputs = ApplyGMDH(gmdh, Inputs);
TrainOutputs = Outputs(:,TrainInd);
TestOutputs = Outputs(:,TestInd);
%% Predict Plots
% figure;
% PlotResults(TrainTargets, TrainOutputs, 'Train Data');
% figure;
% PlotResults(TestTargets, TestOutputs, 'Test Data');
% figure;
% set(gcf, 'Position', [50, 100, 1200, 450])
% [MSE RMSE ErrorMean ErrorStd Errors]=PlotResults(Targets, Outputs, 'All Data');
figure;
plotregression(TrainTargets, TrainOutputs, 'Train Data', ...
TestTargets, TestOutputs, 'TestData', ...
Targets, Outputs, 'GMDH All Data');
% Forecast Plot
fore=180;
sizee=size(x);
sizee=sizee(1,2);
forecasted=Outputs(1,end-fore:end);
forecasted=forecasted+x(end)/2.5;
ylbl=sizee+fore;
t = linspace(sizee,ylbl,length(forecasted));
% Compare the simulated output with measured data to ensure it is a good fit.
nstep=32;
sys = nlarx(Outputs',64);
figure;
set(gcf, 'Position', [50, 200, 1300, 400])
compare(Outputs',sys,nstep);title('Covid Iran Death');
grid on;
%
figure;
regres=compare(Outputs',sys,nstep);title('Covid Iran Death');
b1 = regres\Outputs';
yCalc1 = b1*regres;
scatter(regres,Outputs','MarkerEdgeColor',[0 .5 .5],...
'MarkerFaceColor',[0 .8 .8],...
'LineWidth',1);
hold on
plot(regres,yCalc1,':',...
'LineWidth',2,...
'MarkerSize',5,...
'Color',[0.6350 0.0780 0.1840]);
title(['Regression : ' num2str(b1) ])
grid on
%
figure;
set(gcf, 'Position', [20, 20, 1000, 250])
plot(x,'rs-',...
'LineWidth',1,...
'MarkerSize',5,...
'Color',[0,0,0.7]);
hold on;
plot(t,forecasted,'b*-',...
'LineWidth',2,...
'MarkerSize',10,...
'Color',[0.9,0.5,0]);
title('Johns Hopkins Data for Iran COVID Deaths - Orange is Forcasted')
xlabel('Days - From Jan 2020 Till Dec 2021','FontSize',12,...
'FontWeight','bold','Color','b');
ylabel('Number of People','FontSize',12,...
'FontWeight','bold','Color','b');
datetick('x','mmm');
legend({'Measured','GMDH Forecasted'});
%% Metrics
% Explained Variance (EV)
va=var(Outputs-Targets);vc=var(Targets);
vd=abs(va/vc);vb = vd / 10^floor(log10(vd));
EV=vb*0.1
% % MSE
% MSE
% % RMSE
% RMSE
% % Mean Error
% ErrorMean
% % STD Error
% ErrorStd
% % Mean Absolute Error (MAE)
% MAE = mae(Targets,Outputs)
% % Root Mean Squared Log Error (RMSLE)
% RMSLE = sum((log(Targets)-log(Outputs)).^2)*0.1
function [Network2] = TrainUsing_GA_Fcn(Network,Xtr,Ytr)
%% Problem Statement
IW = Network.IW{1,1}; IW_Num = numel(IW);
LW = Network.LW{2,1}; LW_Num = numel(LW);
b1 = Network.b{1,1}; b1_Num = numel(b1);
b2 = Network.b{2,1}; b2_Num = numel(b2);
TotalNum = IW_Num + LW_Num + b1_Num + b2_Num;
NPar = TotalNum;
VarLow = -1;
VarHigh = 10;
FunName = 'Cost_ANN_EA';
%% Algorithm Parameters
SelectionMode = 2; % 1 for Random, 2 for Tournment, 3 for ....
PopSize = 20;
MaxGenerations = 20;
RecomPercent = 15/100;
CrossPercent = 50/100;
MutatPercent = 1 - RecomPercent - CrossPercent;
RecomNum = round(PopSize*RecomPercent);
CrossNum = round(PopSize*CrossPercent);
if mod(CrossNum,2)~=0
CrossNum = CrossNum - 1;
end
MutatNum = PopSize - RecomNum - CrossNum;
%% Initial Population
Pop = rand(PopSize,NPar) * (VarHigh - VarLow) + VarLow;
Cost = feval(FunName,Pop,Xtr,Ytr,Network);
[Cost Inx] = sort(Cost);
Pop = Pop(Inx,:);
%% Main Loop
MinCostMat = [];
MeanCostMat = [];
for Iter = 1:MaxGenerations
%% Recombination
RecomPop = Pop(1:RecomNum,:);
%% CrossOver
%% Parent Selection
SelectedParentsIndex = MySelection_Fcn(Cost,CrossNum,SelectionMode);
%% Cross Over
CrossPop = [];
for ii = 1:2:CrossNum
Par1Inx = SelectedParentsIndex(ii);
Par2Inx = SelectedParentsIndex(ii+1);
Parent1 = Pop(Par1Inx,:);
Parent2 = Pop(Par2Inx,:);
[Off1 , Off2] = MyCrossOver_Fcn(Parent1,Parent2);
CrossPop = [CrossPop ; Off1 ; Off2];
end
%% Mutation
MutatPop = rand(MutatNum,NPar)*(VarHigh - VarLow) + VarLow;
%% New Population
Pop = [RecomPop ; CrossPop ; MutatPop];
Cost = feval(FunName,Pop,Xtr,Ytr,Network);
[Cost Inx] = sort(Cost);
Pop = Pop(Inx,:);
%% Display
MinCostMat = [MinCostMat ; min(Cost)];
[Iter MinCostMat(end)]
MeanCostMat = [MeanCostMat ; mean(Cost)];
subplot(2,1,1)
plot(MinCostMat,'c','linewidth',1.5);
title('Genetic Algorithm Training');a
xlim([1 MaxGenerations])
% hold on
% plot(MeanCostMat,':b','linewidth',2)
% hold off
subplot(2,1,2)
plot(Pop(:,1),Pop(:,2),'*')
title('Population Scattering');
axis([VarLow VarHigh VarLow VarHigh])
pause(0.05)
end
%% Final Result Demonstration
BestSolution = Pop(1,:);
BestCost = Cost(1);
Network2 = ConsNet_Fcn(Network,BestSolution);
[1]徐玲. 基于GMDH神经网络的轮胎硫化温度预测[J]. 橡胶工业, 2013, 60(5):4.
部分理论引用网络文献,若有侵权联系博主删除。