We often need to reshape a high-dimensional array into some required layout, but the result is not always what you expect. My line of thought here was triggered by a piece of PyTorch code:
prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size)
prediction = prediction.transpose(1,2).contiguous()
prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs)
Such a roundabout transformation suggests that a single view cannot do the job. But why?
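To make the discussion concrete, here is a toy NumPy re-creation of that three-step transform. The names batch_size, num_anchors, bbox_attrs, and grid_size come from the snippet above, but the concrete sizes are made up for illustration, and np.ascontiguousarray stands in for PyTorch's .contiguous():

```python
import numpy as np

batch_size, num_anchors, bbox_attrs, grid_size = 1, 2, 3, 2

# A fake detection-head output: (batch, anchors*attrs, grid, grid)
prediction = np.arange(batch_size * num_anchors * bbox_attrs * grid_size**2)
prediction = prediction.reshape(batch_size, bbox_attrs * num_anchors,
                                grid_size, grid_size)

# The same three steps as the PyTorch code, in NumPy terms:
p = prediction.reshape(batch_size, bbox_attrs * num_anchors,
                       grid_size * grid_size)
p = p.transpose(0, 2, 1)        # swap axes 1 and 2, like tensor.transpose(1, 2)
p = np.ascontiguousarray(p)     # actually copy, like .contiguous()
p = p.reshape(batch_size, grid_size * grid_size * num_anchors, bbox_attrs)

print(p.shape)  # (1, 8, 3): one row of bbox_attrs per (cell, anchor) pair
```

The rest of this post works out why the intermediate transpose-plus-copy step is unavoidable.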
While searching, I found an excellent answer on Stack Overflow:
How does NumPy's transpose() method permute the axes of an array?
The key insight: no matter how many dimensions an array has, its data lives in one contiguous one-dimensional block of memory. All the higher-dimensional structure adds is the strides, i.e. how far to jump to reach the next element along each axis. Naturally, transpose changes nothing but the strides.
An example:
a=np.arange(1,17)
a
Out[10]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
a=a.reshape(2,2,4)
a
Out[12]:
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8]],
[[ 9, 10, 11, 12],
[13, 14, 15, 16]]])
At this point a.shape = (2, 2, 4), and the corresponding strides are 2*4, 4, 1,
i.e. (8, 4, 1) in elements. (NumPy actually reports strides in bytes, so with an 8-byte int they become (64, 32, 8).)
# Simulate printing the multi-dimensional array from its flat buffer.
# a is the (2, 2, 4) array defined above.
memory = np.arange(1, 17)   # the underlying contiguous one-dimensional buffer
strides = (8, 4, 1)         # element strides for shape (2, 2, 4)
for i in range(a.shape[0]):
    i_index = i * strides[0]
    for j in range(a.shape[1]):
        j_index = i_index + j * strides[1]
        for k in range(a.shape[2]):
            k_index = j_index + k * strides[2]
            print(memory[k_index], end=' ')
        print()
""" Output:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
"""
Now, after transpose(1, 0, 2), only the following attributes change:
a.shape = (2, 2, 4)   # the first two dimensions have swapped places (they happen to be equal here, so the shape looks unchanged)
a.strides = (4, 8, 1) # the first two strides have swapped as well
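These numbers are element strides; NumPy itself reports strides in bytes via the `strides` attribute. A small check (using an explicit 8-byte integer dtype so the byte counts are predictable) confirms that transpose swaps strides without moving any data:

```python
import numpy as np

a = np.arange(1, 17, dtype=np.int64).reshape(2, 2, 4)
t = a.transpose(1, 0, 2)

print(a.strides)  # (64, 32, 8): element strides (8, 4, 1) times 8 bytes
print(t.strides)  # (32, 64, 8): first two strides swapped, nothing copied
print(np.shares_memory(a, t))  # True: both views read the same buffer
print(t[0, 1, 0])  # 9, i.e. a[1, 0, 0]: flat index 1*8 + 0*4 + 0*1 = 8
```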
for i in range(a.shape[1]):
    i_index = i * strides[1]
    for j in range(a.shape[0]):
        j_index = i_index + j * strides[0]
        for k in range(a.shape[2]):
            k_index = j_index + k * strides[2]
            print(memory[k_index], end=' ')
        print()
"""
1 2 3 4
9 10 11 12
5 6 7 8
13 14 15 16
"""
Now back to PyTorch's view. Precisely because the data is stored as one contiguous block, view can only reinterpret that block in order.
Suppose a.shape = (2, 3, 4) and you want to turn it into (3, 8). Give the axes meaning: two samples, 3 classes, 4 pixels each. Now you want one class per row, with 8 values in that row: the first sample's 4 pixels followed by the second sample's 4 pixels.
a=np.arange(1,25)
a=a.reshape(2,3,4)
a
Out[39]:
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
[[13, 14, 15, 16],
[17, 18, 19, 20],
[21, 22, 23, 24]]])
Then the (3, 8) result we hope for is:
array([[1, 2, 3, 4, 13, 14, 15, 16],
[ 5, 6, 7, 8, 17, 18, 19, 20],
[ 9, 10, 11, 12, 21, 22, 23, 24]])
But calling view directly gives this instead:
a_tensor = torch.from_numpy(a)
a_view=a_tensor.view(3,-1)
a_view
Out[42]:
tensor([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24]])
because view simply carves up the contiguous memory in order.
To get the result we want, we must transpose first:
a_trans = a_tensor.transpose(1,0)
# Like many such operations, transpose does not rearrange the underlying data,
# so the following view raises; the error message tells us to call contiguous(),
# which really does copy the data into a new contiguous block:
a_correct = a_trans.view(3,-1)
Traceback (most recent call last):
File "" , line 1, in <module>
a_correct = a_trans.view(3,-1)
RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /Users/soumith/mc3build/conda-bld/pytorch_1549593514549/work/aten/src/TH/generic/THTensor.cpp:213
After applying the fix:
a_correct = a_trans.contiguous().view(3,-1)
a_correct
Out[47]:
tensor([[ 1, 2, 3, 4, 13, 14, 15, 16],
[ 5, 6, 7, 8, 17, 18, 19, 20],
[ 9, 10, 11, 12, 21, 22, 23, 24]])
As you can see, we now get exactly the result we wanted.
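A closing aside: NumPy's reshape performs this copy silently when the array is not contiguous, which is why the same pattern never errors in NumPy. PyTorch's Tensor.reshape behaves like NumPy here, returning a view when possible and falling back to a copy otherwise, while view insists on a no-copy reinterpretation. A sketch of the NumPy behavior:

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)
t = a.transpose(1, 0, 2)  # non-contiguous view of shape (3, 2, 4)

# reshape copies automatically when strides alone cannot express the result,
# so no explicit "contiguous" step is needed here.
b = t.reshape(3, -1)
print(b[0])  # [ 1  2  3  4 13 14 15 16]
```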