Error when calling whiten on a matrix: TypeError: cannot perform reduce with flexible type

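Before clustering with kmeans, the data needs to be normalized. My matrix was fetched from a database and then went through one more conversion. When I used whiten to normalize it, I got this error: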

Traceback (most recent call last):

  File "", line 1, in

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/cluster/vq.py", line 133, in whiten

    std_dev = std(obs, axis=0)

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/fromnumeric.py", line 2817, in std

    keepdims=keepdims)

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/_methods.py", line 116, in _std

    keepdims=keepdims)

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/_methods.py", line 86, in _var

    arrmean = um.add.reduce(arr, axis=axis, dtype=dtype, keepdims=True)

TypeError: cannot perform reduce with flexible type


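The key part of the error is the last line: TypeError: cannot perform reduce with flexible type. "Flexible type" is NumPy's name for string and void dtypes, which cannot be summed or averaged.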

I printed the matrix many times before noticing that its dtype is not numeric: every element is a string. It prints like this:

array([['1', '25', '1', ..., '0', '1', '9011.92'],

       ['0', '28', '0', ..., '0', '0', '2400'],

       ['1', '34', '1', ..., '0', '1', '1.97'],

       ..., 

       ['1', '31', '1', ..., '0', '1', '0'],

       ['0', '23', '1', ..., '0', '1', '3700'],

       ['0', '41', '1', ..., '0', '1', '3700']], 

      dtype='|S9')
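
A quick way to spot this kind of problem is to check the dtype before calling whiten. Below is a minimal sketch, assuming a small made-up sample in place of the real database rows. `is_numeric` is a hypothetical helper name, not part of NumPy or SciPy.

```python
import numpy as np

# Made-up sample standing in for rows that came back from a database as text.
points = np.array([["1", "25", "9011.92"],
                   ["0", "28", "2400"]])


def is_numeric(arr):
    """Return True if the array's dtype is a numeric type (int, float, complex)."""
    return bool(np.issubdtype(arr.dtype, np.number))


# A string dtype shows up here (for example |S7 on Python 2 or <U7 on Python 3).
print(points.dtype)
print(is_numeric(points))
```

If the check comes back False, functions that compute a mean or standard deviation, whiten included, will most likely fail on the array.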

So I guessed the dtype was the problem and looked it up. This article made everything click:

http://www.cnblogs.com/hhh5460/p/5129032.html

It turns out that astype can convert an array to a different dtype, so I tried changing my matrix's dtype:

points = points.astype(float)
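
One caveat: astype(float) raises ValueError if any cell is not parseable as a number. A minimal sketch of a more defensive conversion follows. `to_float_matrix` is a hypothetical helper name and the sample rows are made up.

```python
import numpy as np


def to_float_matrix(rows):
    """Convert rows of numeric strings to a float array, or fail with a clear message."""
    arr = np.asarray(rows)
    try:
        return arr.astype(float)
    except ValueError as exc:
        # Re-raise with context so it is obvious the input data is at fault.
        raise ValueError("input contains a non-numeric cell: %s" % exc)


rows = [["1", "25", "9011.92"],
        ["0", "28", "2400"]]
points = to_float_matrix(rows)
print(points.dtype)
```

Converting once, right after loading the data, keeps every later step working with a numeric array.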

Printed out, every value is now a float:

>>> print points

[[  1.00000000e+00   2.50000000e+01   1.00000000e+00 ...,   0.00000000e+00

    1.00000000e+00   9.01192000e+03]

 [  0.00000000e+00   2.80000000e+01   0.00000000e+00 ...,   0.00000000e+00

    0.00000000e+00   2.40000000e+03]

 [  1.00000000e+00   3.40000000e+01   1.00000000e+00 ...,   0.00000000e+00

    1.00000000e+00   1.97000000e+00]

 ..., 

 [  1.00000000e+00   3.10000000e+01   1.00000000e+00 ...,   0.00000000e+00

    1.00000000e+00   0.00000000e+00]

 [  0.00000000e+00   2.30000000e+01   1.00000000e+00 ...,   0.00000000e+00

    1.00000000e+00   3.70000000e+03]

 [  0.00000000e+00   4.10000000e+01   1.00000000e+00 ...,   0.00000000e+00

    1.00000000e+00   3.70000000e+03]]


Normalizing with whiten again now runs without the error:

>>> whiten(points)

array([[  2.12715108e+00,   2.19163343e+00,   1.25619815e+00, ...,

          0.00000000e+00,   1.25619815e+00,   7.88747918e-01],

       [  0.00000000e+00,   2.45462944e+00,   0.00000000e+00, ...,

          0.00000000e+00,   0.00000000e+00,   2.10054573e-01],

       [  2.12715108e+00,   2.98062147e+00,   1.25619815e+00, ...,

          0.00000000e+00,   1.25619815e+00,   1.72419795e-04],

       ..., 

       [  2.12715108e+00,   2.71762546e+00,   1.25619815e+00, ...,

          0.00000000e+00,   1.25619815e+00,   0.00000000e+00],

       [  0.00000000e+00,   2.01630276e+00,   1.25619815e+00, ...,

          0.00000000e+00,   1.25619815e+00,   3.23834133e-01],

       [  0.00000000e+00,   3.59427883e+00,   1.25619815e+00, ...,

          0.00000000e+00,   1.25619815e+00,   3.23834133e-01]])
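
For reference, the traceback shows whiten calling std(obs, axis=0), and the output above is consistent with each column simply being divided by its standard deviation, with no mean subtraction: 25 and 28 in the second column become about 2.19 and 2.45, the same ratio. A minimal NumPy sketch of that computation follows. `whiten_by_hand` is a hypothetical name, the sample data is made up, and replacing a zero standard deviation with 1 is my own choice for avoiding division by zero.

```python
import numpy as np


def whiten_by_hand(obs):
    """Divide each column by its standard deviation (no centering)."""
    obs = np.asarray(obs, dtype=float)
    std_dev = obs.std(axis=0)
    # Assumption: leave constant columns unchanged instead of dividing by zero.
    std_dev[std_dev == 0] = 1.0
    return obs / std_dev


sample = np.array([[1.0, 2.0],
                   [3.0, 2.0],
                   [5.0, 2.0]])
scaled = whiten_by_hand(sample)
print(scaled.std(axis=0))   # non-constant columns end up with unit standard deviation
print(scaled.mean(axis=0))  # means are rescaled, not moved to zero
```

Because the columns are only rescaled, a feature with a large mean keeps a large mean after whitening. That is fine for kmeans, which depends on distances between points rather than on where the data is centered.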





Perfect!




