Implementing the K-means Clustering Algorithm in MATLAB for Mathematical Modeling

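K-means clustering is a widely used unsupervised learning algorithm that partitions the observations in a data set into groups, or clusters. Its goal is to assign the data points to k clusters so that the total squared distance from each point to the center of its own cluster is minimized.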
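One common way to write this objective is the within-cluster sum of squares. The notation below is an assumption introduced for illustration and does not appear in the original post: $C_j$ is the set of points assigned to cluster $j$, and $\mu_j$ is the center of that cluster.

$$
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2
$$

Each iteration of the algorithm can only lower $J$ or leave it unchanged, which is why the procedure converges, although not necessarily to the global minimum.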

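The algorithm proceeds as follows:

1. **Choose the number of clusters (k):** First, specify how many clusters the data should be divided into. This can be fixed in advance or determined with a heuristic.

2. **Initialize the cluster centers:** Randomly select k data points as the initial cluster centers.

3. **Assign the data points:** Assign each data point to the cluster center nearest to it.

4. **Update the cluster centers:** For each cluster, compute the mean of all its members and use that mean as the new cluster center.

5. **Repeat steps 3 and 4:** Repeat steps 3 and 4 until the cluster centers no longer change significantly or a preset number of iterations is reached.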
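The five steps above can be sketched in a few lines of MATLAB. This is a minimal illustration on toy data, not the full script of this post: the variable names (`C`, `idx`, `maxIter`, `tol`) and the two-blob data set are assumptions chosen for the example, and the code relies on implicit expansion (MATLAB R2016b or later).

```matlab
% Minimal K-means sketch on toy data (all names here are illustrative).
rng(0);                                   % fix the seed so the run is repeatable
X = [randn(50, 2); randn(50, 2) + 8];     % two well-separated 2-D blobs
k = 2;                                    % step 1: choose the number of clusters
n = size(X, 1);
perm = randperm(n);
C = X(perm(1:k), :);                      % step 2: k random data points as centers
maxIter = 100;                            % iteration cap
tol = 1e-6;                               % stop when the centers move less than this
for iter = 1:maxIter
    % Step 3: squared distance from every point to every center (n-by-k),
    % then assign each point to its nearest center.
    D = zeros(n, k);
    for j = 1:k
        D(:, j) = sum((X - C(j, :)).^2, 2);
    end
    [~, idx] = min(D, [], 2);
    % Step 4: recompute each center as the mean of its members
    % (an empty cluster keeps its previous center).
    Cnew = C;
    for j = 1:k
        if any(idx == j)
            Cnew(j, :) = mean(X(idx == j, :), 1);
        end
    end
    % Step 5: stop once the centers have effectively stopped moving.
    shift = norm(Cnew - C, 'fro');
    C = Cnew;
    if shift < tol
        break;
    end
end
```

After the loop, `idx` holds the cluster index of each row of `X` and `C` holds the final centers.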

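The advantages of K-means are that it is simple, easy to implement, and computationally efficient. However, it is sensitive to the choice of initial cluster centers, and it is also sensitive to outliers and noise.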
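A common way to reduce the sensitivity to initialization is to run the algorithm several times from different starting centers and keep the best result. The sketch below uses the built-in `kmeans` function with its `'Replicates'` option; it assumes the Statistics and Machine Learning Toolbox is installed, and the toy data and variable names are illustrative only.

```matlab
% Sketch: several restarts to reduce sensitivity to the initial centers.
% Assumes the Statistics and Machine Learning Toolbox (kmeans) is available.
rng(1);                                   % repeatable toy data
X = [randn(100, 2); randn(100, 2) + 6; randn(100, 2) + [6, -6]];
k = 3;
% Run K-means 10 times from different starting centers and keep the
% solution with the smallest total within-cluster distance.
[idx, C, sumd] = kmeans(X, k, 'Replicates', 10);
totalCost = sum(sumd);                    % objective value of the best run
fprintf('best total within-cluster distance: %.2f\n', totalCost);
```

Restarts do not help with outliers; those usually have to be handled by cleaning the data or by switching to a more robust method.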

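The algorithm is commonly used for cluster analysis, image segmentation, and other tasks that require dividing data points into groups. Note that choosing a suitable number of clusters is a key issue, and it sometimes has to be tuned through experiments or other methods.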
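One simple experiment for choosing k is the so-called elbow heuristic: run K-means for a range of k values and look for the point where the total within-cluster distance stops dropping sharply. The sketch below is one possible way to do this with the built-in `kmeans` function (Statistics and Machine Learning Toolbox assumed); the data, the range `kList`, and the name `W` are assumptions made for the example.

```matlab
% Sketch: elbow heuristic for choosing the number of clusters.
% Assumes the Statistics and Machine Learning Toolbox (kmeans) is available.
rng(2);                                   % repeatable toy data
X = [randn(100, 2); randn(100, 2) + 6; randn(100, 2) + [6, -6]];
kList = 1:6;                              % candidate numbers of clusters
W = zeros(size(kList));                   % total within-cluster distance per k
for t = 1:numel(kList)
    [~, ~, sumd] = kmeans(X, kList(t), 'Replicates', 5);
    W(t) = sum(sumd);
end
figure;
plot(kList, W, 'o-');
xlabel('number of clusters k');
ylabel('total within-cluster squared distance');
```

The curve usually flattens once k reaches the number of natural groups in the data, but the elbow is not always clear, so the result should be treated as a hint and not as a definitive answer.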

[Figure 1: K-means clustering in MATLAB]

The MATLAB code is as follows:


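% Generate the data set: five 2-D Gaussian clusters with 200 points each.
% Note: mvnrnd requires the Statistics and Machine Learning Toolbox.
Sigma = [1, 0; 0, 1];            % shared covariance matrix (identity)
mu1 = [1, -1];
x1 = mvnrnd(mu1, Sigma, 200);
mu2 = [5, -4];
x2 = mvnrnd(mu2, Sigma, 200);
mu3 = [1, 4];
x3 = mvnrnd(mu3, Sigma, 200);
mu4 = [6, 4];
x4 = mvnrnd(mu4, Sigma, 200);
mu5 = [7, 0.0];
x5 = mvnrnd(mu5, Sigma, 200);
X = [x1; x2; x3; x4; x5];        % 1000-by-2 data matrix
% Ground-truth labels, used only for evaluation (not by the clustering)
X_label = [ones(200, 1); 2 * ones(200, 1); 3 * ones(200,1); 4 * ones(200, 1);5 * ones(200, 1)];
% Show the data points, colored by their true cluster
plot(x1(:,1), x1(:,2), 'r.'); hold on;
plot(x2(:,1), x2(:,2), 'b.');
plot(x3(:,1), x3(:,2), 'k.');
plot(x4(:,1), x4(:,2), 'g.');
plot(x5(:,1), x5(:,2), 'm.');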
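% Select the initial cluster centers: draw m random candidate sets of k
% centers inside the bounding box of the data and keep the set with the
% smallest total squared distance from each point to its nearest center.
m = 30;                 % number of candidate initializations
a = max(X);             % per-dimension upper bound of the data
b = min(X);             % per-dimension lower bound of the data
k = 5;                  % number of clusters
mu = zeros(k,2*m);      % candidate t is stored in columns 2*t-1 : 2*t
r = zeros(m,1);         % cost of each candidate set
for t=1:m
    for i=1:k
        mu(i,2*t-1:2*t)=[a(1)+(b(1)-a(1))*rand,a(2)+(b(2)-a(2))*rand];
    end
    for j = 1 : size(X, 1)
        R = repmat(X(j, :), k, 1) - mu(:,2*t-1:2*t);
        r(t) = r(t) + min(sum(R.*R, 2));   % squared distance to the nearest center
    end
end
[~, p] = min(r);        % best candidate (the first one if there is a tie)
mu = mu(:,2*p-1:2*p);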
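label = zeros(size(X, 1), 1);
mu_new = mu;
tol = 1e-6;             % convergence tolerance (avoids shadowing the built-in eps)
delta = 1;
while (delta > tol)
    mu = mu_new;
    % Assignment step: give each point the index of its nearest center
    for i = 1 : size(X, 1)
        y = repmat (X(i, :), k, 1);
        dist = y - mu;
        d = sum(dist.*dist,2);
        [~, j] = min(d);        % returns a single index even if distances tie
        label(i) = j;
    end
    % Update step: move each center to the mean of its members
    for j = 1 : k
        order = find(label == j);
        if ~isempty(order)      % an empty cluster keeps its old center (avoids NaN)
            mu_new(j, :) = mean(X(order, :), 1);
        end
    end
    % Total movement of the centers in this iteration (Frobenius norm)
    delta = sqrt(sum(sum((mu-mu_new).*(mu-mu_new))));
end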
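% Final assignment of every point to its nearest converged center
label = zeros(size(X, 1), 1);
for i = 1 : size(X, 1)
    R = repmat(X(i,:),k,1) - mu;
    Residual = sum(R.*R,2);
    [~, j] = min(Residual);     % returns a single index even if distances tie
    label(i) = j;
end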
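% Construct the map function: cluster indices are arbitrary, so map each
% cluster to the most frequent true label among its members. Two clusters
% can map to the same true label if the clustering is poor.
s = zeros(k, 1);
for j =1 : k
    order = find(label==j);
    Y = X_label(order);
    s(j) = mode(Y);
end
map_label = zeros(size(X, 1), 1);
for j = 1 : k
    map_label(label==j) = s(j);
end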
% Plot the clustering result, one color per mapped label
figure;
hold on;
colors = {'r.', 'b.', 'k.', 'g.', 'm.'};
for c = 1 : 5
    idx = (map_label == c);
    plot(X(idx,1), X(idx,2), colors{c});
end
% show the cluster center
plot(mu(:,1), mu(:,2), 'yo', 'LineWidth', 3);
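% Calculate NMI (Normalized Mutual Information) between the mapped cluster
% labels and the ground-truth labels
n = size(X, 1);           % total number of points
d = zeros(5, 1);          % d(i): number of points with mapped label i
g = d;                    % g(j): number of points with true label j
sigma = zeros(5,5);       % sigma(i,j): points with mapped label i and true label j
numerator = 0;
denominator1 = 0;
denominator2 = 0;
for i = 1 : 5
    d(i) = sum(map_label==i);
    g(i) = sum(X_label==i);
end
for i = 1 : 5
    for j = 1 : 5
        sigma(i,j) = sum(map_label==i & X_label==j);
        if sigma(i,j)~=0
            % Contribution of cell (i,j) to n times the mutual information
            numerator = numerator + sigma(i,j).*log(n.*sigma(i,j)./(d(i).*g(j)));
        end
    end
end

% Each denominator is -n times the entropy of one labeling
for i = 1 : 5
    if d(i)~=0
        denominator1 = denominator1 + d(i).*log(d(i)/n);
    end
    if g(i)~=0
        denominator2 = denominator2 + g(i).*log(g(i)/n);
    end
end
denominator = sqrt(denominator1 * denominator2);
NMI = numerator/denominator;
fprintf('NMI=%.3f\n',NMI);
% Accuracy: fraction of points whose mapped label equals the true label
accuracy = sum(map_label == X_label)/n;
fprintf('accuracy=%.3f\n',accuracy);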
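For reference, the NMI value computed by the last part of the script can be written as a single formula. The symbols follow the variable names in the code: $\sigma_{ij}$ is the number of points with mapped label $i$ and true label $j$, $d_i$ and $g_j$ are the corresponding row and column totals, and $n$ is the total number of points (1000 here).

$$
\mathrm{NMI} = \frac{\sum_{i}\sum_{j} \sigma_{ij} \log \dfrac{n\,\sigma_{ij}}{d_i\, g_j}}{\sqrt{\left(\sum_{i} d_i \log \dfrac{d_i}{n}\right)\left(\sum_{j} g_j \log \dfrac{g_j}{n}\right)}}
$$

The numerator is $n$ times the mutual information of the two labelings, and each factor under the square root is $-n$ times the entropy of one labeling, so the factors of $n$ cancel and the ratio equals $I / \sqrt{H_1 H_2}$. A value of 1 means the two labelings agree perfectly up to renaming, and a value near 0 means they are nearly independent.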




The results of running the code are shown below:

[Figure 2: K-means clustering in MATLAB, run results]
