[MindSpore Beginner Tutorial] 01 Tensor

Tensor

    • Definition
    • Indexing
    • Operations
    • NumPy conversion
    • Functional methods
    • References

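Data in deep learning usually takes the form of scalars, vectors, matrices, or higher-dimensional arrays. A scalar is 0-dimensional, a vector is a 1-dimensional sequence of values, and a matrix is 2-dimensional, like a black-and-white image. A color image adds a color-channel axis and is therefore 3-dimensional. The figure below shows common data formats.

[Figure 1: common data formats]

MindSpore provides the Tensor data structure to store the multidimensional arrays (n-dimensional arrays) used during computation.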

Definition

The Tensor interface is defined as follows:

mindspore.Tensor(input_data=None, dtype=None, shape=None, init=None, internal=False)

The commonly used parameters are:

  • input_data (Union[Tensor, float, int, bool, tuple, list, numpy.ndarray]) - The data to store. Basic types such as float, int, and bool are supported, as are tuple, list, and numpy.ndarray.
  • dtype (mindspore.dtype) - The data type of the Tensor; it must be a type defined in mindspore.dtype. If None, the data type is the same as that of input_data.
  • shape (Union[tuple, list, int]) - The shape of the Tensor. It does not need to be set when input_data is given.

For example,

"""
mindspore 1.8 CPU DEMO
"""
from mindspore import Tensor

# Basic types
a = Tensor(10)
b = Tensor(3.1415926)
c = Tensor(False)

print(a)
print(b)
print(c)
print(type(c))

# output
#10
#3.1415925
#False
#<class 'mindspore.common.tensor.Tensor'>

# NumPy arrays and Python sequences
import numpy as np
import mindspore as ms

x = np.array([1, 2, 3], dtype=np.int32)
x_ms = Tensor(x)
x_ms_int64 = Tensor(x, dtype=ms.int64) # set the data type explicitly

y_ms = Tensor([1.0, 2.0])
z_ms = Tensor([[True, False],[False, True]])

shape = (3,32,32)
shape_ms = Tensor(shape)

print(x_ms, x_ms.dtype)
print(x_ms_int64, x_ms_int64.dtype)
print(y_ms)
print(z_ms)
print(shape_ms)

# output
#[1 2 3] Int32
#[1 2 3] Int64
#[1. 2.]
#[[ True False]
# [False  True]]
#[ 3 32 32]

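NumPy arrays support at most 32 dimensions (the limit in NumPy 1.x). MindSpore accepts such arrays as well, subject to available memory. For example,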

from mindspore import Tensor
import numpy as np
import mindspore as ms

shape = tuple([1 for i in range(32)])
print(shape)
a = np.ones(shape,dtype=np.int8)
b = Tensor(a)
print(b.shape)

# output
#(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
#(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

When the NumPy array holds a very large number of elements, MindSpore may run out of memory, fail to allocate, and raise an error. For example,

shape = tuple([2 for i in range(32)])
print(shape)
a = np.ones(shape,dtype=np.int32)
b = Tensor(a)
print(b.shape)

# output
(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
[WARNING] CORE(36564,1,?):2022-7-31 16:59:1 [mindspore\core\ir\tensor.cc:75] NewData] Try to alloca a large memory, size is:17179869184
Traceback (most recent call last):
  File "tensor.py", line 33, in <module>
    b = Tensor(a)
  File "C:\Users\46638\miniconda3\envs\mindspore\lib\site-packages\mindspore\common\tensor.py", line 160, in __init__
    Tensor_.__init__(self, input_data)
MemoryError: std::bad_alloc
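
The size reported in the warning can be checked by hand. The short sketch below is plain Python arithmetic, not MindSpore code, and the variable names are chosen only for illustration. It multiplies the element count of a 32-dimensional array of side 2 by the 4 bytes of an int32.

```python
# Size of the array requested in the failing example above.
ndim = 32        # number of dimensions
dim_size = 2     # length of every dimension
itemsize = 4     # bytes per np.int32 element

num_elements = dim_size ** ndim
num_bytes = num_elements * itemsize

print(num_elements)         # 4294967296
print(num_bytes)            # 17179869184, the size shown in the warning
print(num_bytes / 2 ** 30)  # 16.0 (GiB)
```

The Tensor therefore needs about 16 GiB for its own data, on top of the memory already held by the NumPy array.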

When input_data is given, shape should not be set; otherwise a ValueError is raised: If input_data is available, shape doesn't need to be set

For example,

from mindspore import Tensor
import numpy as np

shape = tuple([i+1 for i in range(4)])
print(shape)
a = np.ones(shape,dtype=np.int8)
b = Tensor(a, shape=shape)
print(b.shape)

# output
Traceback (most recent call last):
  File "tensor.py", line 39, in <module>
    b = Tensor(a, shape=shape)
  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\common\tensor.py", line 127, in __init__
    _check_tensor_input(input_data, dtype, shape, init)
  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\common\tensor.py", line 4438, in _check_tensor_input
    raise ValueError("If input_data is available, shape doesn't need to be set")
ValueError: If input_data is available, shape doesn't need to be set

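The remaining Tensor parameters are rarely used.

  • init (Initializer) - Defers initialization of the Tensor's data in parallel mode. If init is given, dtype and shape must also be given. Using it outside automatic-parallel scenarios is not recommended. The Tensor's data is initialized with the given init only when Tensor.init_data is called.
  • internal (bool) - Whether the Tensor was created by the framework: True means it was created by the framework, False means it was created by the user. Default: False.

Tensor attributes

A tensor's attributes include its shape, data type, transposed tensor, item size, total number of bytes, number of dimensions, number of elements, and the stride of each dimension.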

from mindspore import Tensor
import mindspore as ms
import numpy as np

x = Tensor(np.array([[1, 2], [3, 4]]), ms.int32)

print("x_shape:", x.shape)
print("x_dtype:", x.dtype)
print("x_transposed:\n", x.T)
print("x_itemsize:", x.itemsize)
print("x_nbytes:", x.nbytes)
print("x_ndim:", x.ndim)
print("x_size:", x.size)
print("x_strides:", x.strides)

# output
#x_shape: (2, 2)
#x_dtype: Int32
#x_transposed:
# [[1 3]
# [2 4]]
#x_itemsize: 4
#x_nbytes: 16
#x_ndim: 2
#x_size: 4
#x_strides: (8, 4)
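
To see where the stride values come from, here is a minimal pure-Python sketch. It assumes a C-contiguous (row-major) layout, and the helper name c_strides is hypothetical. The stride of an axis is the number of bytes to skip to move one step along that axis.

```python
def c_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array with the given shape."""
    strides = []
    step = itemsize  # the last axis moves one element at a time
    for dim in reversed(shape):
        strides.append(step)
        step *= dim  # each earlier axis skips a whole block of later axes
    return tuple(reversed(strides))


print(c_strides((2, 2), 4))       # (8, 4), same as x.strides above
print(c_strides((3, 32, 32), 4))  # (4096, 128, 4)
```

The same numbers link the other attributes: nbytes equals size times itemsize, here 4 * 4 = 16.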

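Indexing

Tensor indexing is similar to NumPy indexing: indices start at 0, negative indices count from the end, and the colon : and the ellipsis ... are used for slicing.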

from mindspore import Tensor
import numpy as np

tensor = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))

print("First row: {}".format(tensor[0]))
print("value of bottom right corner: {}".format(tensor[1, 1]))
print("Last column: {}".format(tensor[:, -1]))
print("First column: {}".format(tensor[..., 0]))

# output
#First row: [0. 1.]
#value of bottom right corner: 3.0
#Last column: [1. 3.]
#First column: [0. 2.]

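Operations

Tensors support many operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), and sampling. They are used much like their NumPy counterparts; a few are introduced below.

The basic arithmetic operations are addition (+), subtraction (-), multiplication (*), division (/), modulo (%), and floor division (//).

from mindspore import Tensor
import mindspore as ms
import numpy as np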

x = Tensor(np.array([1, 2, 3]), ms.float32)
y = Tensor(np.array([4, 5, 6]), ms.float32)

output_add = x + y
output_sub = x - y
output_mul = x * y
output_div = y / x
output_mod = y % x
output_floordiv = y // x

print("add:", output_add)
print("sub:", output_sub)
print("mul:", output_mul)
print("div:", output_div)
print("mod:", output_mod)
print("floordiv:", output_floordiv)

# output
#add: [5. 7. 9.]
#sub: [-3. -3. -3.]
#mul: [ 4. 10. 18.]
#div: [4.  2.5 2. ]
#mod: [0. 1. 0.]
#floordiv: [4. 2. 2.]

Concat joins a sequence of tensors along the given axis (axis 0 by default). For example,

from mindspore import ops

data1 = Tensor(np.array([[0, 1], [2, 3]]).astype(np.float32))
data2 = Tensor(np.array([[4, 5], [6, 7]]).astype(np.float32))
op = ops.Concat()
output = op((data1, data2))

print(output)
print("shape:\n", output.shape)

# output
#[[0. 1.]
# [2. 3.]
# [4. 5.]
# [6. 7.]]
#shape:
# (4, 2)

NumPy conversion

Tensors and NumPy arrays can be converted to each other. Use asnumpy() to convert a Tensor to a NumPy array. For example,

from mindspore import ops
import mindspore as ms

zeros = ops.Zeros()
output = zeros((2, 2), ms.float32)
print("output: {}".format(type(output)))

n_output = output.asnumpy()
print("n_output: {}".format(type(n_output)))

# output
output: <class 'mindspore.common.tensor.Tensor'>
n_output: <class 'numpy.ndarray'>

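Converting a NumPy array to a Tensor, as the earlier examples showed, is done by passing it directly to the Tensor constructor. For example,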

import numpy as np
from mindspore import Tensor

output = np.array([1, 0, 1, 0])
print("output: {}".format(type(output)))

t_output = Tensor(output)
print("t_output: {}".format(type(t_output)))

# output
output: <class 'numpy.ndarray'>
t_output: <class 'mindspore.common.tensor.Tensor'>

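Functional methods

Tensor provides a number of commonly used methods, for example,

  • choose(choices, mode='clip')

    Constructs a new Tensor from a list of choices, using the current Tensor as the index array. The mode parameter ('raise', 'wrap', 'clip', optional) specifies how indices outside [0, n-1] are handled, where n is the number of choices: raise raises an exception; wrap maps the index to its value modulo n; clip (the default, as the signature shows) maps values greater than n-1 to n-1, and negative indexing is disabled in this mode.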

	import numpy as np
	from mindspore import Tensor
	
	choices1 = [0,2,3,1]
	choices2 = [[0,2,3,1], [30, 31, 32, 33]]
	x = Tensor(np.array([2, 3, 1, 0]))
	y1 = x.choose(choices1)
	y2 = x.choose(choices2)
	print(y1)
	print(y2)
	
	#output
	[3 1 2 0]
	[30 31 32  1]
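
The selection rule can be written out in plain Python. This is a sketch for 1-D index lists only, not MindSpore's implementation. The function below is a hypothetical helper, and it assumes that clip mode clamps negative indices to 0, as NumPy's choose does.

```python
def choose(index, choices, mode="clip"):
    """Return result[i] = choices[index[i]][i] for a 1-D index list.

    Scalar entries in choices are broadcast to every position.
    """
    n = len(choices)
    result = []
    for i, k in enumerate(index):
        if mode == "clip":
            k = min(max(k, 0), n - 1)  # clamp into [0, n-1]
        elif mode == "wrap":
            k = k % n                  # wrap around modulo n
        elif not 0 <= k < n:           # any other mode behaves like 'raise'
            raise ValueError(f"index {k} is out of range for {n} choices")
        row = choices[k]
        result.append(row[i] if isinstance(row, list) else row)
    return result


index = [2, 3, 1, 0]
print(choose(index, [0, 2, 3, 1]))                                   # [3, 1, 2, 0]
print(choose(index, [[0, 2, 3, 1], [30, 31, 32, 33]]))               # [30, 31, 32, 1]
print(choose(index, [[0, 2, 3, 1], [30, 31, 32, 33]], mode="wrap"))  # [0, 31, 32, 1]
```

The first two results match y1 and y2 above: with only two choices, the indices 2 and 3 are clipped to 1.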

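  • clip(xmin, xmax, dtype=None)

    Clips the values in the Tensor. Given an interval, values outside it are clipped to the interval edges. For example, with the interval [0, 1], values less than 0 become 0 and values greater than 1 become 1.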

	from mindspore import Tensor
	
	x = Tensor([1, 2, 3, -4, 0, 3, 2, 0]).astype("float32")
	y = x.clip(0, 2)
	print(y)
	
	t = Tensor([1, 1, 1, 1, 1, 1, 1, 1])
	y = x.clip(t, 2)
	print(y)
	
	# output
	[1. 2. 2. 0. 0. 2. 2. 0.]
	[1. 2. 2. 1. 1. 2. 2. 1.]
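
As a rough pure-Python equivalent of the two calls above, assuming a 1-D input, the hypothetical helper below clamps each value to the interval. The lower bound is either one scalar or a list with one bound per element.

```python
def clip(values, xmin, xmax):
    """Clamp each value to [xmin, xmax]; xmin is a scalar or a per-element list."""
    lows = xmin if isinstance(xmin, list) else [xmin] * len(values)
    return [min(max(v, lo), xmax) for v, lo in zip(values, lows)]


x = [1.0, 2.0, 3.0, -4.0, 0.0, 3.0, 2.0, 0.0]
print(clip(x, 0.0, 2.0))        # [1.0, 2.0, 2.0, 0.0, 0.0, 2.0, 2.0, 0.0]
print(clip(x, [1.0] * 8, 2.0))  # [1.0, 2.0, 2.0, 1.0, 1.0, 2.0, 2.0, 1.0]
```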
  • diagonal(offset=0, axis1=0, axis2=1)

    Returns the specified diagonal of the Tensor.

	import numpy as np
	from mindspore import Tensor
	a = Tensor(np.arange(4).reshape(2, 2))
	print(a)
	
	output = a.diagonal()
	print(output)
	
	# output
	[[0 1]
	 [2 3]]
	[0 3]
  • expand_as(x)

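    Expands the current Tensor to the shape of the input tensor x, so the output has the same shape as x. The two shapes must satisfy the broadcasting rules: aligned from the last dimension, each dimension of the current Tensor must either equal the corresponding dimension of x or be 1, and the current Tensor must not have more dimensions than x.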

	import numpy as np
	from mindspore import Tensor
	from mindspore import dtype as mstype
	x = Tensor([1, 2, 3], dtype=mstype.float32)
	y = Tensor(np.ones((2, 3)), dtype=mstype.float32)
	output = x.expand_as(y)
	print(output)
	
	# output
	[[1. 2. 3.]
	 [1. 2. 3.]]
	
	y = Tensor(np.ones((2, 4)), dtype=mstype.float32)
	output = x.expand_as(y)  # raises ValueError: (3,) cannot be broadcast to (2, 4)
	
	# output
	Traceback (most recent call last):
	  File "tensor.py", line 117, in <module>
	    output = x.expand_as(y)
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\common\tensor.py", line 960, in expand_as
	    return tensor_operator_registry.get('broadcast_to')(x.shape)(self)
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\ops\primitive.py", line 294, in __call__
	    return _run_op(self, self.name, args)
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\common\api.py", line 93, in wrapper
	    results = fn(*arg, **kwargs)
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\ops\primitive.py", line 755, in _run_op
	    output = real_run_op(obj, op_name, args)
	ValueError: mindspore\core\ops\broadcast_to.cc:60 BroadcastToInferShape] For 'BroadcastTo', in order to broadcast, each dimension pair must be equal or input dimension is 1 or target dimension is -1. But got x_shape: (3), target shape: (2, 4).
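
The rule quoted in the error message can be checked without MindSpore. The sketch below uses a hypothetical helper, can_broadcast, that compares the two shapes from the trailing dimension backwards. It ignores the special target dimension -1 mentioned in the message.

```python
def can_broadcast(src_shape, target_shape):
    """Return True if src_shape can be broadcast to target_shape."""
    if len(src_shape) > len(target_shape):
        return False
    # Compare dimension pairs starting from the last axis.
    for src, dst in zip(reversed(src_shape), reversed(target_shape)):
        if src != dst and src != 1:
            return False
    return True


print(can_broadcast((3,), (2, 3)))    # True:  the first expand_as call above
print(can_broadcast((3,), (2, 4)))    # False: the call that raises ValueError
print(can_broadcast((1, 3), (2, 3)))  # True:  a size-1 axis can be stretched
print(can_broadcast((2, 3), (3,)))    # False: the source has more dimensions
```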
  • expand_dims(axis)

    Expands the Tensor's dimensions along the given axis. The axis parameter is the position of the new dimension; it is an int in the range [-self.ndim-1, self.ndim+1).

	import numpy as np
	from mindspore import Tensor
	x = Tensor(np.ones((2,2), dtype=np.float32))
	print(x)
	print(x.shape)
	print(x.ndim)
	
	y = x.expand_dims(axis=0)
	print(y)
	print(y.shape)
	
	y = x.expand_dims(axis=-4) # axis is outside [-3, 3), so this raises ValueError
	print(y)
	
	# output
	(2, 2)
	2
	(1, 2, 2)
	Traceback (most recent call last):
	  File "tensor.py", line 125, in <module>
	    y = x.expand_dims(axis=-4) # axis is outside [-3, 3), so this raises ValueError
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\common\tensor.py", line 1859, in expand_dims
	    validator.check_int_range(axis, -self.ndim - 1, self.ndim + 1, Rel.INC_LEFT, 'axis')
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\_checkparam.py", line 414, in check_int_range
	    return check_number_range(arg_value, lower_limit, upper_limit, rel, int, arg_name, prim_name)
	  File "miniconda3\envs\mindspore\lib\site-packages\mindspore\_checkparam.py", line 211, in check_number_range
	    prim_name, arg_name, rel_str, arg_value, type(arg_value).__name__))
	ValueError: The 'axis' must be in range of [-3, 3), but got -4 with type 'int'.
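
How the axis range maps to output shapes can be sketched in plain Python. The helper expand_dims_shape is hypothetical and computes only the resulting shape, using the same range check as the error message above.

```python
def expand_dims_shape(shape, axis):
    """Shape obtained by inserting a size-1 dimension at position axis."""
    ndim = len(shape)
    if not -ndim - 1 <= axis < ndim + 1:
        raise ValueError(
            f"The 'axis' must be in range of [{-ndim - 1}, {ndim + 1}), but got {axis}."
        )
    if axis < 0:
        axis += ndim + 1  # negative axes count from the end of the new shape
    return shape[:axis] + (1,) + shape[axis:]


print(expand_dims_shape((2, 2), 0))   # (1, 2, 2)
print(expand_dims_shape((2, 2), -1))  # (2, 2, 1)
print(expand_dims_shape((2, 2), -3))  # (1, 2, 2), the smallest valid axis

try:
    expand_dims_shape((2, 2), -4)
except ValueError as err:
    print(err)
```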
  • squeeze(axis=None)

    Removes dimensions of size 1 from the Tensor's shape. To remove only some of them, use axis to select a subset of the size-1 axes.

	import numpy as np
	from mindspore import Tensor
	x = Tensor(np.ones((1,2,1,2,1), dtype=np.float32))
	print(x.shape)
	
	y = x.squeeze()
	print(y.shape)
	
	y = x.squeeze(axis=[-1,-3]) # remove only the selected size-1 axes
	print(y.shape)
	
	#output
	(1, 2, 1, 2, 1)
	(2, 2)
	(1, 2, 2)
  • fill(value)

    Fills the array with a scalar value.

    Unlike NumPy, Tensor.fill() always returns a new Tensor instead of filling the original Tensor in place.

	import numpy as np
	from mindspore import Tensor
	a = Tensor(np.arange(4).reshape((2,2)).astype('float32'))
	print(a.fill(1.0))
	
	#output
	[[1. 1.]
	 [1. 1.]]
  • fills(value)

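    Creates a Tensor with the same shape and type as the current Tensor and fills it with a scalar value. value is the fill value; it can be an int, a float, or a 0-dimensional Tensor. This method is not yet supported on CPU in the MindSpore 1.8 version used here.

    As above, Tensor.fills() always returns a new Tensor instead of filling the original Tensor in place.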

	import numpy as np
	from mindspore import Tensor
	a = Tensor(np.arange(4).reshape((2,2)).astype('float32'))
	print(a.fills(1.0))
	
	#output
	RuntimeError: mindspore\ccsrc\plugin\device\cpu\hal\hardware\cpu_device_context.cc:240 SetOperatorInfo] Unsupported op [Fills] on CPU, Please confirm whether the device target setting is correct, or refer to 'mindspore.ops' at https://www.mindspore.cn to query the operator support list.
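  • flatten(order='C')

    Returns a copy of the Tensor flattened to one dimension. order (str) supports only 'C' and 'F': 'C' flattens in row-major (C-style) order, and 'F' flattens in column-major (Fortran-style) order.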

	import numpy as np
	from mindspore import Tensor
	
	x = Tensor(np.arange(4).reshape(2, 2))
	y1 = x.flatten() # default order is 'C'
	y2 = x.flatten('F')
	print(x)
	print("C: ", y1)
	print("F: ", y2)
	
	#output
	[[0 1]
	 [2 3]]
	C:  [0 1 2 3]
	F:  [0 2 1 3]
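
The two traversal orders can be reproduced on a nested list. The following sketch handles 2-D input only, and the function is a hypothetical helper: 'C' walks row by row, 'F' walks column by column.

```python
def flatten(matrix, order="C"):
    """Flatten a 2-D nested list in row-major ('C') or column-major ('F') order."""
    if order == "C":
        return [value for row in matrix for value in row]
    if order == "F":
        return [row[col] for col in range(len(matrix[0])) for row in matrix]
    raise ValueError("order must be 'C' or 'F'")


x = [[0, 1], [2, 3]]
print(flatten(x))       # [0, 1, 2, 3]
print(flatten(x, "F"))  # [0, 2, 1, 3]
```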
  • gather(input_indices, axis)
    Indexes the current Tensor (params) along the axis dimension using input_indices and assembles the selected slices into a new tensor, as illustrated below:
    [Figure 2: illustration of gather]
import numpy as np
from mindspore import Tensor
import mindspore

# case1: input_indices is a Tensor with shape (5, ).
input_params = Tensor(np.array([1, 2, 3, 4, 5, 6, 7]), mindspore.float32)
input_indices = Tensor(np.array([0, 2, 4, 2, 6]), mindspore.int32)
axis = 0
output = input_params.gather(input_indices, axis)
print(output)

# case2: input_indices is a Tensor with shape (2, 2). When the input_params has one dimension,
# the output shape is equal to the input_indices shape.
input_indices = Tensor(np.array([[0, 2], [2, 6]]), mindspore.int32)
axis = 0
output = input_params.gather(input_indices, axis)
print(output)


# case3: input_indices is a Tensor with shape (2, ) and
# input_params is a Tensor with shape (3, 4) and axis is 0.
input_params = Tensor(np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]), mindspore.float32)
input_indices = Tensor(np.array([0, 2]), mindspore.int32)
axis = 0
output = input_params.gather(input_indices, axis)
print(output)


# case4: input_indices is a Tensor with shape (2, ) and
# input_params is a Tensor with shape (3, 4) and axis is 1.
input_params = Tensor(np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]), mindspore.float32)
input_indices = Tensor(np.array([0, 2]), mindspore.int32)
axis = 1
output = input_params.gather(input_indices, axis)
print(output)

#output
#[1. 3. 5. 3. 7.]
#[[1. 3.]
# [3. 7.]]
#[[ 1.  2.  3.  4.]
# [ 9. 10. 11. 12.]]
#[[ 1.  3.]
# [ 5.  7.]
# [ 9. 11.]]
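
Cases 3 and 4 can be mirrored with nested lists. This is a simplified sketch for 2-D params and 1-D indices, using a hypothetical helper gather_2d: with axis 0 the indices pick whole rows, with axis 1 they pick columns from every row.

```python
def gather_2d(params, indices, axis):
    """Select rows (axis=0) or columns (axis=1) of a 2-D nested list."""
    if axis == 0:
        return [list(params[i]) for i in indices]
    if axis == 1:
        return [[row[i] for i in indices] for row in params]
    raise ValueError("axis must be 0 or 1 for 2-D params")


params = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(gather_2d(params, [0, 2], axis=0))  # [[1, 2, 3, 4], [9, 10, 11, 12]]
print(gather_2d(params, [0, 2], axis=1))  # [[1, 3], [5, 7], [9, 11]]
```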
  • gather_nd(indices)
    gather_nd is similar to gather, but gather can index along only one dimension, whereas gather_nd can index along several dimensions at once.
import numpy as np
from mindspore import Tensor
import mindspore

input_x = Tensor(np.array([[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]]), mindspore.float32)
indices = Tensor(np.array([[0, 0], [1, 1]]), mindspore.int32)
output = input_x.gather_nd(indices)
print(output)

#output
#[-0.1  0.5]
indices = Tensor(np.array([[0, 0], [2, 4]]), mindspore.int32) # [2, 4] is out of range: axis 0 has size 2 and axis 1 has size 3
output = input_x.gather_nd(indices)
print(output)

#output
#[-1.0000000e-01  1.5307091e+12] the second value is arbitrary because the index is out of range
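
Because the out-of-range index above returned an arbitrary value rather than an error, it can be worth validating indices before the call. A minimal pure-Python sketch for 2-D input follows; gather_nd_2d is a hypothetical helper, not a MindSpore API.

```python
def gather_nd_2d(params, indices):
    """Pick params[r][c] for every (r, c) pair, rejecting out-of-range pairs."""
    rows, cols = len(params), len(params[0])
    result = []
    for r, c in indices:
        if not (0 <= r < rows and 0 <= c < cols):
            raise IndexError(f"index ({r}, {c}) is out of range for shape ({rows}, {cols})")
        result.append(params[r][c])
    return result


params = [[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]]
print(gather_nd_2d(params, [[0, 0], [1, 1]]))  # [-0.1, 0.5]

try:
    gather_nd_2d(params, [[0, 0], [2, 4]])
except IndexError as err:
    print(err)  # index (2, 4) is out of range for shape (2, 3)
```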
  • isclose(x2, rtol=1e-05, atol=1e-08, equal_nan=False)

Returns a boolean Tensor indicating, element by element, whether the current Tensor and x2 are equal within a tolerance, similar to numpy.isclose().

import numpy as np
from mindspore import Tensor
import mindspore

input = Tensor(np.array([1.3, 2.1, 3.2, 4.1, 5.1]), mindspore.float16)
other = Tensor(np.array([1.3, 3.3, 2.3, 3.1, 5.1]), mindspore.float16)
output = input.isclose(other)
print(output)

#output
#[ True False False False  True]
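
The tolerance test can be written out directly. The sketch below assumes the NumPy-style criterion |a - b| <= atol + rtol * |b| and leaves out equal_nan. It works on plain Python floats rather than float16, and the function is a hypothetical stand-in.

```python
def isclose(a, b, rtol=1e-05, atol=1e-08):
    """Element-wise check of |x - y| <= atol + rtol * |y| on two equal-length lists."""
    return [abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)]


print(isclose([1.3, 2.1, 3.2, 4.1, 5.1], [1.3, 3.3, 2.3, 3.1, 5.1]))
# [True, False, False, False, True]
print(isclose([1.0], [1.000001]))  # [True]:  the difference is within rtol
print(isclose([1.0], [1.001]))     # [False]: the difference exceeds the tolerance
```

Note that rtol scales with the second argument, so the check is not symmetric in its two inputs.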
  • norm(axis, p=2, keep_dims=False, epsilon=1e-12)

    Computes the p-norm of the given Tensor, treated as a matrix or vector, over the dimensions specified by axis. The p-norm is defined as

$$\|x\|_p = \left(\sum_{i,j} |x_{ij}|^p\right)^{\frac{1}{p}}$$

	import numpy as np
	from mindspore import Tensor
	import mindspore
	
	input_x = Tensor(np.array([[[1.0, 2.0], [3.0, 4.0]], 
							   [[5.0, 6.0], [7.0, 8.0]]]).astype(np.float32))
	output = input_x.norm([0, 1], p=2) # reduce over axes 0 and 1
	print(input_x.shape)
	print(output)
	
	#output
	#(2, 2, 2)
	#[ 9.165152 10.954452]
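
The result can be verified by applying the formula by hand. The sketch below is a hypothetical helper for 3-D nested lists; it sums |x|^p over axes 0 and 1 for each position on the last axis and ignores the epsilon parameter.

```python
def p_norm_axes01(x, p=2):
    """p-norm over axes 0 and 1 of a 3-D nested list: one value per index of axis 2."""
    n0, n1, n2 = len(x), len(x[0]), len(x[0][0])
    return [
        sum(abs(x[i][j][k]) ** p for i in range(n0) for j in range(n1)) ** (1 / p)
        for k in range(n2)
    ]


x = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
result = p_norm_axes01(x)
print([round(v, 6) for v in result])  # [9.165151, 10.954451]
```

The two values are sqrt(1 + 9 + 25 + 49) = sqrt(84) and sqrt(4 + 16 + 36 + 64) = sqrt(120); the last digit differs from the output above only because MindSpore computed in float32.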
  • sum(axis=None, dtype=None, keepdims=False, initial=None)

    Returns the sum of the elements along the specified axis. With the default axis=None, all elements are summed.

	import numpy as np
	from mindspore import Tensor
	input_x = Tensor(np.array([-1, 0, 1]).astype(np.float32))
	print(input_x.sum())
	
	input_x = Tensor(np.arange(10).reshape(2, 5).astype(np.float32))
	print(input_x)
	print(input_x.sum(axis=1))
	
	#output
	#0.0
	#[[0. 1. 2. 3. 4.]
	# [5. 6. 7. 8. 9.]]
	#[10. 35.]
  • reshape(*shape)

    Returns a Tensor with the given new shape; the data is unchanged.

	from mindspore import Tensor
	from mindspore import dtype as mstype
	x = Tensor([[-0.1, 0.3, 3.6], [0.4, 0.5, -3.2]], dtype=mstype.float32)
	output = x.reshape((3, 2))
	print(x)
	print(output)
	
	#output
	"""
	[[-0.1  0.3  3.6]
	 [ 0.4  0.5 -3.2]]
	[[-0.1  0.3]
	 [ 3.6  0.4]
	 [ 0.5 -3.2]]
	"""

	output = x.reshape([3, 2]) # the shape may be given as a list
	output = x.reshape(3, 2) # the shape may be given as separate integers
  • view(*shape)

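    Creates a new Tensor with the given shape and the same data as the original Tensor. It behaves like reshape, except that the shape cannot be passed as a list.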

	from mindspore import Tensor
	import numpy as np
	
	a = Tensor(np.array([[1, 2, 3], [2, 3, 4]], dtype=np.float32))
	output = a.view((3, 2)) 
	print(a)
	print(output)
	
	#output
	"""
	[[1. 2. 3.]
	 [2. 3. 4.]]
	[[1. 2.]
	 [3. 2.]
	 [3. 4.]]
	"""
	
	output = a.view([3, 2]) # a list is not accepted as the shape
	#output
	"""
	TypeError: For 'Reshape', the type of 'shape[0]' should be 'int', but got '[3, 2]' with type 'list'.
	"""
	output = a.view(3, 2) # the shape may be given as separate integers

For more, see the official mindspore.Tensor documentation.

References

  • MindSpore.Tensor

  • tf.gather, tf.gather_nd, and tf.slice

  • PyTorch Tensor

  • p-norm
