With the rapid development of computer and information technology, daily life has become immersed in an ocean of data, and human society has entered the digital age. Abundant data and information are needed to guide social activity and everyday life; in commerce, industrial production, and many engineering fields in particular, large volumes of data underpin profit-oriented decision making, the online monitoring, identification, and diagnosis of production processes, and the design of control strategies. The concept of data mining thus emerged: as a technique for extracting potentially valuable information from massive data sets, it has attracted wide attention from researchers and found application across many engineering domains.

Invasive Weed Optimization (IWO) is a bio-inspired optimization algorithm that models the reproduction, growth, and competition of weeds. Thanks to its robustness, strong search ability, fast convergence, simple structure, and ease of implementation, it outperforms other intelligent optimization algorithms on many problems and has received broad attention in the research community.
IWO is a stochastic search algorithm proposed by A. R. Mehrabian et al. in 2006, derived from the evolutionary behavior of weeds in nature. It imitates the basic processes by which invading weeds spread seeds through space, grow, reproduce, and die off under competition, and it exhibits strong robustness and adaptivity.
"Weed" is a collective term for plants that threaten crops, landscapes, ecosystems, or even humans. The struggle between humans and weeds is an old one: ever since cultivation began, weeds have grown in the fields and hindered cultivated plants, and even with today's technology no physical method (e.g., flame weeding), chemical method (e.g., herbicides), biological method (e.g., introducing natural enemies), or ecological approach can eradicate them. Invasive weeds have extraordinary survival ability and environmental adaptability: they absorb large amounts of soil moisture at the most favorable time at the start of the growing season, leaving the land dry and barren and hindering native plants, and some weeds release compounds that suppress the seeds or seedlings of surrounding plants, allowing them to occupy a territory and thrive. Once mature, a weed scatters its seeds to other sites by wind, water, or self-dispersal; in the next growing season a new round of expansion begins, continually seeking regions with better light, nutrients, and other conditions.

The IWO algorithm is a stochastic optimization algorithm inspired by this reproduction-and-expansion process. In IWO, the region invaded by the weeds is taken as the feasible domain of the problem; each weed represents one feasible solution; a weed's fitness (the quality of its growing environment) reflects the quality of that solution; and the search for the optimum is simulated by three main operations, seed reproduction, spatial diffusion, and competitive exclusion, which establish the correspondence between the weed population and the solution set.
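The two quantitative rules behind these operations can be stated explicitly. This is the standard IWO formulation, written with symbols matching the parameters `Smin`, `Smax`, `sigma_initial`, `sigma_final`, `Exponent` (n), and `MaxIt` (T) used in the script below:

```latex
% Seed reproduction: weed i sows a number of seeds linear in its
% normalized fitness, so the best weed sows S_max and the worst S_min
S_i = \left\lfloor S_{\min} + \left(S_{\max} - S_{\min}\right)
      \frac{f_i - f_{\mathrm{worst}}}{f_{\mathrm{best}} - f_{\mathrm{worst}}} \right\rfloor

% Spatial diffusion: seeds land around the parent with a normally
% distributed step whose standard deviation shrinks over the
% iterations t = 1, \dots, T with nonlinear modulation index n
\sigma_t = \left(\frac{T - t}{T - 1}\right)^{\!n}
           \left(\sigma_{\mathrm{initial}} - \sigma_{\mathrm{final}}\right)
           + \sigma_{\mathrm{final}}
```

Since the script below minimizes a clustering cost, $f_{\mathrm{best}}$ and $f_{\mathrm{worst}}$ correspond to the minimum and maximum costs in the current population.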
% Invasive Weed Optimization (IWO) Clustering
% Created By Seyed Muhammad Hossein Mousavi - 2022
% Comparison with K-means and GMM
clc;
clear;
close all;
warning('off');
%% Basics
% Load the dataset (expects dat.mat on the MATLAB path, containing a
% numeric data matrix XX; the plots below assume four columns)
data = load('dat');
X = data.XX;
%
k = 3; % Number of Clusters
%
CostFunction=@(m) ClusterCost(m, X); % Cost Function
VarSize=[k size(X,2)]; % Decision Variables Matrix Size
nVar=prod(VarSize); % Number of Decision Variables
VarMin= repmat(min(X),k,1); % Lower Bound of Variables
VarMax= repmat(max(X),k,1); % Upper Bound of Variables
%% IWO Params
MaxIt = 25; % Maximum Number of Iterations
nPop0 = 2; % Initial Population Size
nPop = 5; % Maximum Population Size
Smin = 2; % Minimum Number of Seeds
Smax = 5; % Maximum Number of Seeds
Exponent = 1.5; % Variance Reduction Exponent
sigma_initial = 0.2; % Initial Value of Standard Deviation
sigma_final = 0.001; % Final Value of Standard Deviation
%% Intro
% Empty Plant Structure
empty_plant.Position = [];
empty_plant.Cost = [];
empty_plant.Out = [];
pop = repmat(empty_plant, nPop0, 1); % Initial Population Array
for i = 1:numel(pop)
    % Initialize Position
    pop(i).Position = unifrnd(VarMin, VarMax, VarSize);
    % Evaluation
    [pop(i).Cost, pop(i).Out] = CostFunction(pop(i).Position);
end
% Sort Initial Population and Take the Best Solution Found So Far
[~, SortOrder] = sort([pop.Cost]);
pop = pop(SortOrder);
BestSol = pop(1);
% Initialize Best Cost History
BestCosts = zeros(MaxIt, 1);
%% IWO Main Body
for it = 1:MaxIt
    % Update Standard Deviation (nonlinear decrease over iterations)
    sigma = ((MaxIt - it)/(MaxIt - 1))^Exponent * (sigma_initial - sigma_final) + sigma_final;
    % Get Best and Worst Cost Values
    Costs = [pop.Cost];
    BestCost = min(Costs);
    WorstCost = max(Costs);
    % Initialize Offspring Population
    newpop = [];
    % Reproduction: better weeds produce more seeds
    for i = 1:numel(pop)
        if BestCost == WorstCost
            ratio = 1; % all weeds equally fit; avoid division by zero
        else
            ratio = (pop(i).Cost - WorstCost)/(BestCost - WorstCost);
        end
        S = floor(Smin + (Smax - Smin)*ratio);
        for j = 1:S
            % Initialize Offspring
            newsol = empty_plant;
            % Spatial diffusion: normal perturbation around the parent
            newsol.Position = pop(i).Position + sigma * randn(VarSize);
            % Apply Lower/Upper Bounds
            newsol.Position = max(newsol.Position, VarMin);
            newsol.Position = min(newsol.Position, VarMax);
            % Evaluate Offspring
            [newsol.Cost, newsol.Out] = CostFunction(newsol.Position);
            % Add Offspring to the Offspring Population
            newpop = [newpop; newsol]; %#ok<AGROW>
        end
    end
    % Merge Populations
    pop = [pop; newpop];
    % Sort Population by Cost
    [~, SortOrder] = sort([pop.Cost]);
    pop = pop(SortOrder);
    % Competitive Exclusion (Delete Extra Members)
    if numel(pop) > nPop
        pop = pop(1:nPop);
    end
    % Store Best Solution Ever Found
    BestSol = pop(1);
    % Store Best Cost History
    BestCosts(it) = BestSol.Cost;
    % Display Iteration Information
    disp(['Iteration ' num2str(it) ': Best Cost = ' num2str(BestCosts(it))]);
    % Plot the current best clustering
    DECenters = PlotRes(X, BestSol);
    pause(0.01);
end
%% Plot IWO Train
figure;
semilogy(BestCosts, 'LineWidth', 2);
xlabel('Iteration');
ylabel('Best Cost');
grid on;
DElbl = BestSol.Out.ind; % IWO cluster labels
%% K-Means Clustering for Comparison
[kidx, KCenters] = kmeans(X, k);
figure
set(gcf, 'Position', [150, 50, 700, 400])
% Plot every pair of the four features, colored by cluster
pairs = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4];
for p = 1:size(pairs, 1)
    subplot(2, 3, p)
    gscatter(X(:,pairs(p,1)), X(:,pairs(p,2)), kidx); title('K-Means')
    hold on;
    plot(KCenters(:,pairs(p,1)), KCenters(:,pairs(p,2)), 'ok', ...
        'LineWidth', 2, 'MarkerSize', 6);
end
%
KMeanslbl = kidx;
%% Gaussian Mixture Model Clustering for Comparison
options = statset('Display', 'final');
gm = fitgmdist(X, k, 'Options', options) % no semicolon: display the fitted model
idx = cluster(gm, X);
figure
set(gcf, 'Position', [50, 300, 700, 400])
% Plot every pair of the four features, colored by cluster
pairs = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4];
for p = 1:size(pairs, 1)
    subplot(2, 3, p)
    gscatter(X(:,pairs(p,1)), X(:,pairs(p,2)), idx); title('GMM')
end
%
GMMlbl = idx;
%% MAE and MSE Errors
% Note: cluster labels are nominal, so these errors are sensitive to how
% each method happens to number its clusters (label permutation).
% mae/mse require the Deep Learning Toolbox.
IWO_GMM_MAE=mae(DElbl,GMMlbl);
IWO_KMeans_MAE=mae(DElbl,KMeanslbl);
GMM_KMeans_MAE=mae(GMMlbl,KMeanslbl);
IWO_GMM_MSE=mse(DElbl,GMMlbl);
IWO_KMeans_MSE=mse(DElbl,KMeanslbl);
GMM_KMeans_MSE=mse(GMMlbl,KMeanslbl);
fprintf('IWO vs GMM MAE = %0.4f.\n',IWO_GMM_MAE)
fprintf('IWO vs K-Means MAE = %0.4f.\n',IWO_KMeans_MAE)
fprintf('GMM vs K-Means MAE = %0.4f.\n',GMM_KMeans_MAE)
fprintf('IWO vs GMM MSE = %0.4f.\n',IWO_GMM_MSE)
fprintf('IWO vs K-Means MSE = %0.4f.\n',IWO_KMeans_MSE)
fprintf('GMM vs K-Means MSE = %0.4f.\n',GMM_KMeans_MSE)
function m = PlotRes(X, sol)
    % Cluster Centers
    m = sol.Position;
    k = size(m, 1);
    % Cluster Indices
    ind = sol.Out.ind;
    Colors = hsv(k);
    % Plot every pair of the four features, colored by cluster
    pairs = [1 2; 1 3; 1 4; 2 3; 2 4; 3 4];
    for j = 1:k
        Xj = X(ind==j, :);
        for p = 1:size(pairs, 1)
            subplot(2, 3, p)
            plot(Xj(:,pairs(p,1)), Xj(:,pairs(p,2)), 'x', ...
                'LineWidth', 1, 'Color', Colors(j,:));
            title('IWO');
            hold on;
            % plot(m(:,pairs(p,1)), m(:,pairs(p,2)), 'ok', ...
            %     'LineWidth', 2, 'MarkerSize', 6); % uncomment to mark centers
        end
    end
    hold off;
end
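The script calls `ClusterCost`, which is not included in this listing. Below is a minimal sketch consistent with how the script uses it: it must return a scalar cost plus a struct whose `ind` field holds each point's cluster index. The within-cluster-distance cost and the use of `pdist2` (Statistics and Machine Learning Toolbox) are assumptions, not necessarily the original author's implementation:

```matlab
function [z, out] = ClusterCost(m, X)
    % m : k-by-d matrix of candidate cluster centers (one weed's position)
    % X : n-by-d data matrix
    d = pdist2(X, m);            % distance from every point to every center
    [dmin, ind] = min(d, [], 2); % assign each point to its nearest center
    z = sum(dmin);               % cost: total point-to-center distance
    out.d = d;
    out.dmin = dmin;
    out.ind = ind;               % cluster label of each point
end
```

Saved as `ClusterCost.m` on the path, this makes the script self-contained apart from the `dat.mat` data file.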