Generating Test Data for Machine Learning Clustering

The data is generated with sklearn.datasets.make_blobs.

Usage
from sklearn.datasets import make_blobs

make_blobs(n_samples=100,
           n_features=2,
           centers=None,
           cluster_std=1.0,
           center_box=(-10.0, 10.0),
           shuffle=True,
           random_state=None
          )

The content here was collected from the sklearn website.

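Parameters

n_samples: total number of samples to generate; when given as an integer, the samples are divided equally among the clusters

n_features: number of features (the dimensionality of the data)

centers: number of cluster centers; an array of fixed center coordinates can be passed instead

cluster_std: standard deviation of the clusters

center_box: the bounding box (min, max) for each cluster center when centers are generated at random

shuffle: (bool) whether to shuffle the samples

random_state: random seed; pass an integer to get the same data on every call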
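
To make the role of center_box concrete, here is a minimal sketch. It assumes a scikit-learn version that supports the optional return_centers flag (0.23 or later), which adds the generated centers as a third return value; the sample count, box limits, and seed are arbitrary values picked for illustration.

```python
from sklearn.datasets import make_blobs

# return_centers=True is assumed to be available (scikit-learn 0.23 or later).
x, y, centers = make_blobs(n_samples=50,
                           n_features=2,
                           centers=4,
                           center_box=(0.0, 10.0),
                           random_state=1,
                           return_centers=True)

print(centers.shape)  # (4, 2): one row per cluster center

# Randomly drawn centers stay inside center_box.
# The samples themselves are scattered around the centers and may fall outside it.
print(((centers >= 0.0) & (centers <= 10.0)).all())  # True
```

Note that center_box only constrains where the centers are drawn; it does not clip the generated points.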

The function returns two values, x and y: the feature array and the cluster label of each sample.
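
A short sketch of what the two return values look like and how random_state affects them. The seed 42 is an arbitrary choice for illustration, and NumPy is assumed to be installed alongside scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Calling twice with the same integer seed should give identical data.
x1, y1 = make_blobs(n_samples=10, n_features=3, centers=3, random_state=42)
x2, y2 = make_blobs(n_samples=10, n_features=3, centers=3, random_state=42)

print(x1.shape)  # (10, 3): one row per sample, one column per feature
print(y1.shape)  # (10,): one integer label per sample
print(np.array_equal(x1, x2) and np.array_equal(y1, y2))  # True
print(sorted(set(y1.tolist())))  # [0, 1, 2]: labels are center indices
```

With random_state=None, each call draws new centers and new points, so results differ between runs.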

from sklearn.datasets import make_blobs

x, y = make_blobs(n_samples=10,
                  n_features=3,
                  centers=3,
                  cluster_std=1.0,
                  center_box=(0, 10),
                  random_state=None)
print(x, y)
'''
Sample output (the values differ between runs because random_state=None)
x:
[[ 1.84867966  2.50972065  8.81261611]
 [ 4.41642014  5.25986382  8.6340203 ]
 [ 3.03900151  2.78507042  3.41420868]
 [ 1.06376153  4.22600614  1.25715659]
 [ 2.62402063  3.47719094  9.10799865]
 [ 2.65567624  3.31822041  3.76622607]
 [ 5.58539182  6.07267481  8.83147447]
 [ 5.63875173  5.57249052 10.6567555 ]
 [ 5.42795437  4.75499218  8.91883699]
 [ 2.78790159  4.85890851  9.20342921]]
y:
[2 0 1 1 2 1 0 0 0 2]
'''
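
The example above lets make_blobs pick the centers at random. As a complementary sketch, centers can also be given as explicit coordinates, and cluster_std can be a list with one value per cluster. The coordinates, spreads, and seed below are hypothetical values chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Hypothetical center coordinates, chosen for illustration only.
fixed_centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])

# n_features is taken from the shape of the centers array (2 here).
x, y = make_blobs(n_samples=300,
                  centers=fixed_centers,
                  cluster_std=[0.5, 1.0, 2.0],  # one spread per cluster
                  random_state=0)

# Each label is the index of the center the sample was drawn around,
# so the per-cluster mean should land close to the requested center.
for label, center in enumerate(fixed_centers):
    points = x[y == label]
    print(label, len(points), center, points.mean(axis=0).round(2))
```

Fixing the centers this way is handy when a clustering algorithm's output needs to be compared against known ground truth.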
