Understanding NumPy in Depth (1): Indexing, View or Copy

View? Copy?

Confused by the concept of a view? Let's clear it up.

First, here is how the official NumPy documentation defines a view:

An array that does not own its data, but refers to another array’s data instead. For example, we may create a view that only shows every second element of another array

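The official definition is nice and clear!

So a view is itself an array, but one built on top of another array's data; it has no memory of its own for storing data. (Sharing data this way avoids needless memory copies in some situations.) In the example below, y is a view built on x's data.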

>>> x = np.arange(5)
>>> x
array([0, 1, 2, 3, 4])

>>> y = x[::2]
>>> y
array([0, 2, 4])

>>> x[0] = 3 # changing x changes y as well, since y is a view on x
>>> y
array([3, 2, 4])

Clearly, x and y share one block of data in the code above. But can every array[index] expression be treated as a view of the array? Consider the following code:

>>> x = np.arange(5)
>>> y_copy = x[[0, 2, 4]] # index is a list, not a tuple!
>>> y_copy
array([0, 2, 4])
>>> x[0] = 3
>>> x
array([3, 1, 2, 3, 4])
>>> y_copy
array([0, 2, 4])         # x[0] changed, y_copy[0] not affected!

A bit surprising, isn't it? It looks like we need a thorough understanding of how NumPy indexing works.
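One way to check the two cases above without mutating anything is sketched below. It uses np.shares_memory and the .base attribute, which are not part of the original examples; the variable names simply mirror the ones used above.

```python
import numpy as np

x = np.arange(5)
y = x[::2]               # slice index
y_copy = x[[0, 2, 4]]    # integer-list index

# A view keeps a reference to the array that owns the data.
print(y.base is x)                    # True
print(np.shares_memory(x, y))         # True

# The list-indexed result lives in its own memory.
print(np.shares_memory(x, y_copy))    # False
```

np.shares_memory answers the question "do these two arrays overlap in memory?" directly, so it is a convenient test whenever it is unclear whether an indexing expression produced a view.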

Indexing

ndarray_object[index], when used as an "rvalue" (read rather than assigned to), triggers

ndarray_object.__getitem__(index)

The logic of this function differs considerably depending on the type of index:

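  • simple indexing
    • single element indexing (a view when only some axes are indexed, e.g. x[2] on a 2-D array; a scalar, not a view, when every axis is indexed)
    • slice indexing (view)
  • Advanced indexing
    • integer array indexing (copy)
    • bool array indexing (copy)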

From the perspective of the array data memory layout, the value returned by ndarray_object[index] is either a view or a copy.
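The categories above can be probed with a short script. This is only a sketch: the 3x3 array and the labels are chosen here for illustration, and np.shares_memory is used as the view-or-copy test.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)

# x[index] is syntactic sugar for x.__getitem__(index)
assert np.array_equal(x[0:2], x.__getitem__(slice(0, 2)))

results = {
    "x[1]       (single index on one axis)": x[1],
    "x[0:2]     (slice)": x[0:2],
    "x[[0, 1]]  (integer list)": x[[0, 1]],
    "x[x > 4]   (boolean mask)": x[x > 4],
}
for label, r in results.items():
    print(label, "-> shares memory with x:", np.shares_memory(x, r))

# Indexing every axis returns a NumPy scalar rather than an ndarray.
elem = x[1, 1]
print(isinstance(elem, np.ndarray))   # False
```

The first two entries report True (views) and the last two report False (copies).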

Let's work through a few examples and deepen our understanding through questions.

Assume we already have

import numpy as np

Q1 Generate the matrix Mij, which looks like this:
[0, 1, 0, 1, 0]
[1, 0, 1, 0, 1]
[0, 1, 0, 1, 0]
[1, 0, 1, 0, 1]
[0, 1, 0, 1, 0]

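The approach below (not the best one) first creates a 5*5 zero matrix and then fills in 1s at the right positions. Slice indexing makes it easy to select the positions to fill.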

>>> matrix = np.zeros((5, 5), dtype=int)
>>> matrix
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
>>> matrix[0::2, 1::2] = 1
>>> matrix[1::2, 0::2] = 1
>>> matrix
array([[0, 1, 0, 1, 0],
       [1, 0, 1, 0, 1],
       [0, 1, 0, 1, 0],
       [1, 0, 1, 0, 1],
       [0, 1, 0, 1, 0]])

Q2 Define the "side transpose" of a matrix in one line. The "side transpose" is defined as follows: given the input
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
the "side transpose" produces
[8, 7, 6]
[5, 4, 3]
[2, 1, 0]

(array indexing; hint: slicing using -1)

Q3 Let x be the following 2-D array with shape (3, 3)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
what is x[0, 1]?
what is x[(0, 1)]?
what is x[[0, 1]]?
what is x[[0, 1], :]?

(array indexing: single element indexing; index arrays)

Q4 Let x be the following 2-D array with shape (3, 3)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]

# what is x after executing the following code
    y = x[[2]]  # Not x[(2)]!
    y[:] = 0
# what is x after executing the following code
    y = x[2]    # same as x[(2)]
    y[:] = 0

(array indexing VIEW & COPY)

Q5 Let x be the following 2-D array with shape (3, 3)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]

# define "indexing by slice" and "indexing by array"
idx_slice = slice(0, 3, 2)
idx_array = [0, 2]
# what is x after executing the following code
x[idx_slice, :][:, idx_array] = 0
# what is x after executing the following code
x[idx_array, :][:, idx_slice] = 0

Q5 is the hardest of these questions; it requires a solid understanding of __getitem__(), __setitem__(), and NumPy indexing by slice versus by array. The article ViewsVsCopies explains this very well.
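A possible way to reason about Q5 is to split each chained statement into its __getitem__ step and its __setitem__ step. The sketch below does that on two fresh arrays, a and b (names chosen here, whereas Q5 applies both statements to the same x), so the two statements can be compared independently.

```python
import numpy as np

idx_slice = slice(0, 3, 2)
idx_array = [0, 2]

a = np.arange(9).reshape(3, 3)
# Step 1: a[idx_slice, :]         -> __getitem__ with a slice, returns a view
# Step 2: view[:, idx_array] = 0  -> __setitem__ on that view, writes into a
a[idx_slice, :][:, idx_array] = 0
print(a)
# [[0 1 0]
#  [3 4 5]
#  [0 7 0]]

b = np.arange(9).reshape(3, 3)
# Step 1: b[idx_array, :]         -> __getitem__ with a list, returns a copy
# Step 2: copy[:, idx_slice] = 0  -> __setitem__ on the temporary copy only
b[idx_array, :][:, idx_slice] = 0
print(b)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]
```

Only the first index in the chain goes through __getitem__; whether it yields a view or a copy decides whether the final assignment reaches the original array.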

Reference Solution

# Solution of Q1 By GongPing @2016-10-26
# more elegant solution
matrix = np.fromfunction(lambda i, j: (i + j) % 2, (5, 5), dtype=int)

# yet another solution
matrix = np.arange(25).reshape(5, 5) % 2

Reference: numpy.fromfunction
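As a quick sanity check, the three constructions for Q1 can be compared against each other. The names m_slice, m_func and m_mod are introduced here only for this comparison.

```python
import numpy as np

# Slice-assignment approach from Q1
m_slice = np.zeros((5, 5), dtype=int)
m_slice[0::2, 1::2] = 1
m_slice[1::2, 0::2] = 1

# np.fromfunction approach: entry (i, j) is (i + j) % 2
m_func = np.fromfunction(lambda i, j: (i + j) % 2, (5, 5), dtype=int)

# arange approach: flat index modulo 2, reshaped to 5x5
m_mod = np.arange(25).reshape(5, 5) % 2

print(np.array_equal(m_slice, m_func))   # True
print(np.array_equal(m_slice, m_mod))    # True
```

Note that the arange version yields a checkerboard only because the row length (5) is odd; with an even width every row would be identical, while the other two approaches work for any shape.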

# Solution of Q2 By GongPing @2016-10-26
side_transpose = lambda a: a[::-1, ::-1]
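A small check of the Q2 one-liner, assuming the 3x3 input from the question. Because the result comes from slicing alone, it should be a view, which the sketch confirms by writing through it; the comparison with np.rot90 is added here as an independent cross-check.

```python
import numpy as np

side_transpose = lambda a: a[::-1, ::-1]

a = np.arange(9).reshape(3, 3)
s = side_transpose(a)
print(s)
# [[8 7 6]
#  [5 4 3]
#  [2 1 0]]

# Reversing both axes is the same as rotating by 180 degrees.
print(np.array_equal(s, np.rot90(a, 2)))   # True

# Slicing returns a view, so writing to s changes a as well.
print(np.shares_memory(a, s))              # True
s[0, 0] = 100
print(a[2, 2])                             # 100
```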

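For Q3, Q4, and Q5 you can easily find the results by experiment; trying them hands-on is recommended.

For a thorough understanding, reading the official quick start tutorial from start to finish is recommended.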
