DIANA算法c++实现

 第一步对具有最大直径的簇中每个点计算平均相异度找出最大的点放入splinter group,其余放在放入splinter group

第二步 在old party里找出到splinter group中点的最近距离 <= 到old party中点的最近距离的点,并将该点加入splinter group

重复第二步的工作,直到没有新的old party中的点分配给splinter group且满足分裂的簇数k,如果没有到达终止条件,则从分裂好的簇中选一个最大的簇按刚才的分裂方法继续分裂

#include 
#include 
#include 
#include 
#include 
using namespace std;

struct Point
{
    double x;
    double y;
};


double distance(const Point& a, const Point& b)
{
    return sqrt(pow(a.x - b.x, 2) + pow(a.y - b.y, 2));
}

vector findmaxC(vector>& clusters)
{
    double max = 0;
    int index = 0;
    for (int i =0;i< clusters.size(); i++)
    {
        double max1 = 0;
        for (int j = 0; j < clusters[i].size(); j++)
        {
            for (int k = j + 1; k < clusters[i].size(); k++)
            {
                double dis = distance(clusters[i][j], clusters[i][k]);
               
                if (max < dis)
                {
                    max1 = dis;
                }
                
            }
        }
        if (max1 > max)
        {
            max1 = max;
            index = i;
        }
    }
    return clusters[index];
}
int findMaxD(vector cluster)
{
    double max = 0;
    int maxIndex = 0;
    for (int i = 0; i < cluster.size(); i++)
    {
        double avg = 0;
        double dis = 0;
        for (int j = 0; j < cluster.size(); j++)
        {
            if (i != j)
            {
                dis += distance(cluster[i], cluster[j]);
            }
        }
        avg = dis / (cluster.size() - 1);
        if (avg > max)
        {
            max = avg;
            maxIndex = i;
        }
    }
    return maxIndex;
}

void spilt(vector& cluster, int k,int maxIndex)
{
    int flag = 0;
    vector splinter;
    vector oldParty;
    vector> sum;
  
    splinter.push_back(cluster[maxIndex]);// splinter.push_back[maxIndex]   +1?
    for (int i = 0; i < cluster.size(); i++)
    {
        if (i != maxIndex)
        {
            oldParty.push_back(cluster[i]); //oldParty.push_back i
        }
    }

    sum.push_back(splinter);
    sum.push_back(oldParty);

    while (sum.size() <= k && flag == 0)
    {
        double min = numeric_limits::max();
        int in = 0;
        for (int i = 0; i < sum[1].size(); i++)
        {
            for (int j = 0; j < sum[1].size(); j++)
            {
                if (j != i)
                {
                    double d = distance(sum[1][i], sum[1][j]);
                    if (d < min)
                    {
                        min = d;
                        in = i;                 //min
                    }
                }
            }
             cout << min << endl;
             double min1 = numeric_limits::max();
             int in1 = 0;
             for (int m = 0;m < sum[0].size(); m++)
             {
                 double ds = distance(sum[0][m], sum[1][in]);
                 if (ds < min1)
                 {
                        min1 = ds;
                            //in1 = i;            
                  }
              }
             cout << min1< temp=findmaxC(sum);
        int i=findMaxD(temp);
        vector re;
        re.push_back(temp[i]);
        sum.push_back(re);
        //未完
    }
    for (int i = 0; i < sum.size(); i++)
    {
        cout << "{";
        for (int j = 0; j < sum[i].size(); j++)
        {
            cout << "(" << sum[i][j].x << ", " << sum[i][j].y << ")" <<",";
        }
        cout<<"}" << endl;
   }

}

vector ReadData(string filename)
{
    vector data;
    ifstream file(filename);
    if (file.is_open())
    {
        string line;
        while (getline(file, line))
        {
            istringstream iss(line);
            double x, y;
            string token;
            Point point;
            if (getline(iss, token, ',') && istringstream(token) >> point.x &&
                getline(iss, token, ',') && istringstream(token) >> point.y) {
                data.push_back(point);
            }
        }
    }
    else
    {
        cout << "open fail";
    }
    file.close();
    return data;
}

int main()
{
    vector dataset = ReadData("data1.txt");
    int index=findMaxD(dataset);
    int k;
    cout << "终止条件为几个簇" << endl;
    cin >> k;
    spilt(dataset, k, index);

}
//{ {1.0, 1.0}, {1.0, 2.0}, {2.0, 1.0}, {2.0, 2.0}, {3.0, 4.0}, {3.0, 5.0}, {4.0, .0}, {4.0, 5.0} };

 data1:

DIANA算法c++实现_第1张图片

data2:

 DIANA算法c++实现_第2张图片

你可能感兴趣的:(数据挖掘,c++,算法)