Broadcasting in NumPy arrays (ndarray)

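Broadcasting is usually described as the mechanism that lets an operation combine a 'smaller' array with a 'larger' array. That is not the only case, though: it also applies to arrays that hold the same number of elements but have different shapes.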
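As a small illustration of the second case, the sketch below combines two arrays that have the same number of elements but different shapes. The variable names are chosen only for this example.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1), 3 elements
row = np.arange(3).reshape(1, 3)   # shape (1, 3), 3 elements

# Each operand is stretched along its length-1 axis before the addition.
total = col + row

print(col.size, row.size)   # 3 3
print(total.shape)          # (3, 3)
```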

Element-wise operations between arrays are valid only when their shapes are identical or compatible. Identical shapes are easy to understand. So what does compatible mean?

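To decide whether two shapes are compatible, NumPy compares their dimension sizes one by one, starting from the last (trailing) dimension and working toward the first. At each step, if the two sizes are equal, or if either one (or both) is 1, the comparison continues toward the leading dimension. Otherwise a ValueError is raised (for example, "operands could not be broadcast together with shapes ...").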
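The rule can be tried out directly. The following sketch pairs one (4, 3) array with a compatible and an incompatible partner; the shapes are arbitrary choices for illustration, and the exact wording of the error message may differ between NumPy versions.

```python
import numpy as np

a = np.ones((4, 3))

# Trailing sizes 3 and 3 match, then 4 is paired with 1: compatible.
ok = a + np.ones((1, 3))
print(ok.shape)   # (4, 3)

# Trailing sizes 3 and 2 differ and neither is 1: incompatible.
try:
    a + np.ones((4, 2))
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)
```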

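When one of the shapes runs out of dimensions (for example, a1 has shape (2, 3, 4) and a2 has shape (3, 4), so the leading 2 of a1 has no counterpart in a2), NumPy uses 1 for the missing dimension (as if a2.shape were (1, 3, 4)) and keeps comparing until the other shape is exhausted as well.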
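One way to check this padding behaviour is to compare the implicit result with one where the leading 1 is added by hand. This is only a sketch using the a1 and a2 shapes from the paragraph above; the array contents are arbitrary.

```python
import numpy as np

a1 = np.arange(24).reshape(2, 3, 4)
a2 = np.arange(12).reshape(3, 4)

implicit = a1 + a2                     # NumPy treats a2's missing dimension as 1
explicit = a1 + a2.reshape(1, 3, 4)    # the same padding written out by hand

print(implicit.shape)                        # (2, 3, 4)
print(np.array_equal(implicit, explicit))    # True
```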

Once NumPy has established that the two shapes are compatible, the shape of the result takes, in each dimension, the larger of the two sizes.

Here is the description above as pseudocode:
Inputs: array A with m dimensions; array B with n dimensions
p = max(m, n)
if m < p:
    left-pad A's shape with 1s until it also has p dimensions
else if n < p:
    left-pad B's shape with 1s until it also has p dimensions
result_dims = new list with p elements
for i in p-1 ... 0:
    A_dim_i = A.shape[i]    # shapes here are the padded ones
    B_dim_i = B.shape[i]
    if A_dim_i != 1 and B_dim_i != 1 and A_dim_i != B_dim_i:
        raise ValueError("could not broadcast")
    else:
        result_dims[i] = max(A_dim_i, B_dim_i)
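The pseudocode can be turned into a short runnable function. The version below is a sketch, not NumPy's actual implementation: the name broadcast_shape is made up for this example, it loops over the dimensions front to back (the order does not affect the result, since each dimension is checked independently), and it picks the size that is not 1 instead of calling max, so that a zero-length dimension paired with 1 gives 0 as NumPy does.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    p = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so that both have p dimensions.
    padded_a = (1,) * (p - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (p - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a != 1 and dim_b != 1 and dim_a != dim_b:
            raise ValueError(f"could not broadcast {shape_a} with {shape_b}")
        # Keep the size that is not 1 (equal to max() unless a size is 0).
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)


print(broadcast_shape((2, 3, 4), (3, 4)))   # (2, 3, 4)
print(broadcast_shape((5, 1), (5,)))        # (5, 5)

# Cross-check one case against NumPy itself.
reference = np.broadcast(np.empty((5, 4, 1)), np.empty((5, 1, 3))).shape
print(broadcast_shape((5, 4, 1), (5, 1, 3)) == reference)   # True
```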

Here are some examples.

Example

--------------------------------------------------------------------------------------
(4, 3)                          (4, 3)
                == padding ==>                  == result ==> (4, 3)
(3,)                            (1, 3)
--------------------------------------------------------------------------------------
(3,)                            (1, 1, 3)
                == padding ==>                  == result ==> (5, 4, 3)
(5, 4, 3)                       (5, 4, 3)
--------------------------------------------------------------------------------------
(5,)                            (1, 1, 5)
                == padding ==>                  ==> error (5 != 3)
(5, 4, 3)                       (5, 4, 3)
--------------------------------------------------------------------------------------
(5, 4, 3)                       (1, 5, 4, 3)
                == padding ==>                  == result ==> (6, 5, 4, 3)
(6, 5, 4, 3)                    (6, 5, 4, 3)
--------------------------------------------------------------------------------------
(5, 4, 1)
                == no padding needed ==> result ==> (5, 4, 3)
(5, 1, 3)
--------------------------------------------------------------------------------------
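The shape pairs in the examples above can be checked without creating any arrays. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the pairs list and the use of None to mark an incompatible pair are choices made for this example.

```python
import numpy as np

# The five shape pairs from the examples, in order.
pairs = [
    ((4, 3), (3,)),
    ((3,), (5, 4, 3)),
    ((5,), (5, 4, 3)),
    ((5, 4, 3), (6, 5, 4, 3)),
    ((5, 4, 1), (5, 1, 3)),
]

results = []
for shape_a, shape_b in pairs:
    try:
        results.append(np.broadcast_shapes(shape_a, shape_b))
    except ValueError:
        results.append(None)   # None marks an incompatible pair
    print(shape_a, shape_b, "->", results[-1])
```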

One last example:
import numpy as np

a = np.arange(5).reshape(5, 1)
# a.shape == (5, 1)
b = np.arange(5)
# b.shape == (5,)
c = a + b
# b.shape is first padded to (1, 5)
# the final shape then takes the larger size in each dimension
# c.shape == (5, 5)
c
# array([[0, 1, 2, 3, 4],
#        [1, 2, 3, 4, 5],
#        [2, 3, 4, 5, 6],
#        [3, 4, 5, 6, 7],
#        [4, 5, 6, 7, 8]])
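To see what each operand of the final example looks like once it is stretched to the result shape, np.broadcast_to can make the expansion explicit. This is a sketch for illustration only: during a + b, NumPy does not need to build these full-size arrays, and np.broadcast_to itself returns read-only views, not copies.

```python
import numpy as np

a = np.arange(5).reshape(5, 1)
b = np.arange(5)

a_full = np.broadcast_to(a, (5, 5))   # each row repeats one value of a
b_full = np.broadcast_to(b, (5, 5))   # every row is the same as b

print(a_full[2])   # [2 2 2 2 2]
print(b_full[2])   # [0 1 2 3 4]
print(np.array_equal(a_full + b_full, a + b))   # True
```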

