Clustering Algorithms - Overview

Introduction to Clustering

Clustering methods are among the most useful unsupervised ML methods. They find similarities and relationship patterns among data samples and then group those samples into clusters whose members have similar features.

Clustering is important because it reveals the intrinsic grouping in unlabeled data. Clustering algorithms make assumptions about what makes data points similar, and each assumption produces different but equally valid clusters.
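To make the point about assumptions concrete, here is a minimal sketch in plain Python (3.8+ for `math.dist`). It is not taken from the original tutorial: the helper `group_by_threshold`, the sample points, and the threshold of 2 are all hypothetical choices made for illustration. The same data is grouped twice, once assuming similarity means Euclidean closeness and once assuming only the first feature matters.

```python
from math import dist


def group_by_threshold(points, distance, threshold):
    """Group points so that points linked by a chain of close neighbours share a cluster.

    `distance` encodes the similarity assumption; `threshold` is the largest
    distance at which two points still count as similar. (Illustrative helper,
    not a standard library function.)
    """
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            # A cluster is "near" if at least one of its members is close to p
            is_near = any(distance(p, q) <= threshold for q in c)
            (near if is_near else far).append(c)
        # Merge p with every nearby cluster into a single cluster
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters


def first_feature_only(a, b):
    """Alternative assumption: only the first feature decides similarity."""
    return abs(a[0] - b[0])


# Hypothetical sample data with two features per point
points = [(1, 1), (1.5, 1.2), (1.2, 8), (8, 8), (8.3, 7.5)]

euclid_clusters = group_by_threshold(points, dist, 2)
x_only_clusters = group_by_threshold(points, first_feature_only, 2)

print(len(euclid_clusters), euclid_clusters)
print(len(x_only_clusters), x_only_clusters)
```

With these sample points the Euclidean assumption yields three groups, while comparing only the first feature yields two, because the point (1.2, 8) joins the group on the left. Neither result is wrong; they answer different questions about what "similar" means.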

For example, the diagram below shows a clustering system grouping similar data into different clusters:

[Figure: similar data points grouped into separate clusters]

Cluster Formation Methods

Clusters do not have to be spherical. The following are some other cluster formation methods:

Density-based

In these methods, clusters are formed as dense regions of data points. Their advantages are good accuracy and a good ability to merge two clusters. Examples include Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS).
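The idea of clusters as dense regions can be sketched with a simplified DBSCAN-style procedure in plain Python (3.8+ for `math.dist`). This is an illustrative sketch, not a reference implementation and not code from the original tutorial. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a dense point) follow common usage, and the sample points and parameter values are assumptions chosen for the example.

```python
from math import dist

NOISE = -1  # label for points that belong to no dense region


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        # Point i is a core point: start a new cluster and grow it
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


# Hypothetical sample data: two dense groups and one isolated point
points = [(1, 1), (1.2, 1.1), (0.9, 1.3),
          (8, 8), (8.1, 8.2), (7.9, 8.1),
          (4, 15)]

labels = dbscan(points, eps=0.6, min_pts=3)
print(labels)
```

With these values the two tight groups each receive their own label and the isolated point is marked as noise. Because a cluster grows through any chain of dense points, the shape of a cluster follows the data rather than being forced into a sphere.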

Hierarchical-based

In these methods, clusters are formed as a tree-like structure based on a hierarchy. They fall into two categories: agglomerative (bottom-up approach) and divisive (top-down approach). Ex. Clustering Using Representatives (CURE), Balanced iterative Reducing Clustering using Hierarch
