scipy.sparse.hstack: blocks must be 2-D

scipy.sparse.hstack() raised an error when I tried to merge two tables.

It turned out to be a big pitfall.

I found this answer on Stack Overflow.

def hstack(blocks, ...):
    return bmat([blocks], ...)

def bmat(blocks, ...):
    blocks = np.asarray(blocks, dtype='object')
    if blocks.ndim != 2:
        raise ValueError('blocks must be 2-D')
    ...  # (the rest of the function continues)

# @hpaulj
# https://stackoverflow.com/questions/31900567/scipy-sparse-hstack1-2-valueerror-blocks-must-be-2-d-why

As you can see, the argument is first converted to an np.ndarray, and then checked for being 2-dimensional.

 

A test:

import numpy as np
from scipy.sparse import coo_matrix, hstack

aa = np.array([[4], [5], [6]])
bb = np.array([[1], [2], [3]])
print(aa.shape)    # (3, 1)
print(bb.shape)    # (3, 1)

A = coo_matrix(aa)
B = coo_matrix(bb)
print(A.shape)    # (3, 1)
print(B.shape)    # (3, 1)


# After converting to scipy.sparse.coo_matrix, the merge works
C = hstack([A, B])
print(C.shape)    # (3, 2)

# Passing plain numpy.ndarray objects raises an error
c = hstack([aa, bb])
# ValueError: blocks must be 2-D

 

If that is not a pitfall, what is??

Why on earth does np.ndarray raise an error!!!

 

Let's reproduce what happens.

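# Inside hstack, the argument is wrapped in one more list and passed to bmat,
# which converts it to an np.ndarray
blocks = [aa, bb]
blocks = np.asarray([blocks], dtype='object')

# Print it and see
print(blocks.shape)

# Output:
# (1, 2, 3, 1): the dense arrays were unpacked into extra dimensions!

print(blocks.ndim)
# Output: 4, so it is judged not to be a 2-D array of blocks

if blocks.ndim != 2:
    raise ValueError('blocks must be 2-D')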

 

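So in the end, the culprit is how numpy's own np.asarray() treats its input.

With sparse.coo_matrix, each of [A, B] is kept as a single object, so the blocks sit side by side in a 2-D array.

With np.ndarray, [aa, bb] are unpacked element by element, which adds extra dimensions.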
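The difference can be checked directly. This is a minimal sketch, assuming numpy and scipy are installed; it only mirrors the `[blocks]` wrapping shown in the quoted answer, and the variable names `sparse_blocks` and `dense_blocks` are made up for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

aa = np.array([[4], [5], [6]])
bb = np.array([[1], [2], [3]])
A = coo_matrix(aa)
B = coo_matrix(bb)

# Mirror what hstack hands to bmat: the list of blocks wrapped in one more list.
sparse_blocks = np.asarray([[A, B]], dtype='object')
dense_blocks = np.asarray([[aa, bb]], dtype='object')

# Sparse matrices stay whole, one object per cell.
print(sparse_blocks.shape)  # (1, 2)

# Dense arrays are unpacked down to their scalars.
print(dense_blocks.shape)   # (1, 2, 3, 1)
```

Only the first array has ndim == 2, which is why only the sparse input passes the check.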

 

 

In summary, to get rid of the error,

I recommend converting everything to sparse.coo_matrix before stacking.

import numpy as np
from scipy.sparse import coo_matrix, hstack

# aa and bb are the (3, 1) arrays from the test above
A = coo_matrix(aa)
B = coo_matrix(bb)

C = hstack([A, B])
# This way no error is raised
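If the inputs are a mix of dense and sparse blocks, the conversion can be wrapped in a small helper. The function `hstack_any` below is a hypothetical name, not part of scipy; it is only a sketch of the "convert first, then stack" advice, using `scipy.sparse.issparse` to leave blocks that are already sparse untouched.

```python
import numpy as np
from scipy.sparse import coo_matrix, hstack, issparse


def hstack_any(blocks):
    """Hypothetical helper: make every block sparse, then stack horizontally."""
    sparse_blocks = [b if issparse(b) else coo_matrix(b) for b in blocks]
    return hstack(sparse_blocks)


aa = np.array([[4], [5], [6]])             # dense block
B = coo_matrix(np.array([[1], [2], [3]]))  # already sparse block

C = hstack_any([aa, B])
print(C.shape)      # (3, 2)
print(C.toarray())
```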

 

 

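But converting everything like this is such a hassle (said in a Taiwanese accent)!!

So what can we do!!

 

As Uncle says, magic must defeat magic!

Leave numpy's business to numpy!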

 

numpy.hstack(tup)

This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.

tup : sequence of ndarrays

    The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length.
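The 1-D exception in the quoted documentation is easy to miss, so here is a small sketch of both cases. It assumes nothing beyond numpy; the names `cols` and `flat` are illustrative.

```python
import numpy as np

# 2-D column vectors: joined along the second axis.
cols = np.hstack([np.array([[4], [5], [6]]), np.array([[1], [2], [3]])])
print(cols.shape)  # (3, 2)

# 1-D arrays: joined along the first (and only) axis; lengths may differ.
flat = np.hstack([np.array([4, 5, 6]), np.array([1, 2])])
print(flat.shape)  # (5,)
```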

Example:

import numpy as np
from scipy.sparse import coo_matrix, hstack

aa = np.array([[4], [5], [6]])
bb = np.array([[1], [2], [3]])
print(aa.shape)    # (3, 1)
print(bb.shape)    # (3, 1)

# Magic must defeat magic!
c = np.hstack([aa, bb])
print(c.shape)
# (3, 2)


# Finally, convert back to a sparse matrix
cc = coo_matrix(c)


# -----------------------------------
# For comparison: convert first, then stack with scipy.sparse.hstack
A = coo_matrix(aa)
B = coo_matrix(bb)
print(A.shape)
print(B.shape)


C = hstack([A, B])
print(C.shape)
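To confirm that the two routes give the same matrix, the results can be compared after converting back to dense. This is only a sketch with illustrative names (`via_numpy`, `via_scipy`, `same`), assuming the same small arrays as above.

```python
import numpy as np
from scipy.sparse import coo_matrix, hstack

aa = np.array([[4], [5], [6]])
bb = np.array([[1], [2], [3]])

# Route 1: stack the dense arrays with numpy, then convert to sparse.
via_numpy = coo_matrix(np.hstack([aa, bb]))

# Route 2: convert each array to sparse, then stack with scipy.
via_scipy = hstack([coo_matrix(aa), coo_matrix(bb)])

same = np.array_equal(via_numpy.toarray(), via_scipy.toarray())
print(same)  # True
```

Route 1 needs every block to fit in memory as a dense array, so it is mainly convenient when the data starts out dense and small; for data that is already sparse, route 2 avoids the dense detour.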

 
