1. Introduction to Tensors

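Tensors are one of the core concepts in every deep learning library. A PyTorch tensor is almost the same as the familiar NumPy array: on its own it knows nothing about deep learning, computational graphs, or gradients (although all of these concepts are closely tied to tensors). A tensor is simply an n-dimensional array that supports arbitrary mathematical operations. The biggest difference between a PyTorch tensor and a NumPy array is that a tensor can live on either the CPU or the GPU, whereas a NumPy array can only live on the CPU [1]. All elements of a tensor must share a single, fixed data type. Because CPU tensors and GPU tensors are distinct types, Torch defines 9 CPU tensor types and 9 GPU tensor types, as shown in Figure 1 [2].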

Figure 1. Data types of tensor elements in PyTorch

The default tensor type of the torch.Tensor class is torch.FloatTensor.
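
The default type and the CPU/GPU distinction described above can be checked directly. The following is a minimal sketch, assuming the global default dtype has not been changed; the variable names are illustrative only, and the GPU branch runs only when CUDA is available.

```python
import torch

x = torch.tensor([1.0, 2.0])   # Python floats use the default floating dtype
y = torch.tensor([1, 2])       # Python ints become torch.int64
z = torch.Tensor(2, 3)         # legacy constructor: uninitialized FloatTensor

print(x.dtype)           # torch.float32 unless the default dtype was changed
print(y.dtype)           # torch.int64
print(z.dtype, z.shape)

# Move the tensor to the GPU only if one is actually present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x_dev = x.to(device)
print(x_dev.device)
```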

2. Tensor Operations

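There are a great many tensor operations, and reading through and memorizing all of them would take a long time. In practice there is no need to: it wastes time, and the documentation is always available (this is not a closed-book exam). What matters more is knowing which kinds of functions exist. Learning them by comparison with NumPy arrays is useful, because PyTorch tensors and NumPy arrays are so similar. The parts below cover only a subset of the tensor operations; the full list of functions and their usage is in the official documentation [3]. Splitting the material into these parts is meant to give it a clear structure that is easier to follow.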

import numpy as np
import torch

2.1 Basic Tensor Functions

`torch.is_tensor(obj)`

Checks whether the object obj is a tensor and returns True or False.

a = [1, 2]
b = torch.tensor(a)
print(torch.is_tensor(a))  # False
print(torch.is_tensor(b))  # True

`torch.numel(input) → int`

Returns the total number of elements in the tensor input.

a = torch.rand(1, 2, 3, 4)
print(torch.numel(a))  # 24
b = torch.ones(3, 3)
print(torch.numel(b))  # 9
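
As a small cross-check of the two functions above, the element count is just the product of the sizes of all dimensions. This is a sketch with illustrative variable names; `math.prod` assumes Python 3.8 or later.

```python
import math
import torch

a = torch.rand(1, 2, 3, 4)

# numel() equals the product of the sizes along every dimension.
count = torch.numel(a)
product = math.prod(a.shape)
print(count, product)

# is_tensor is equivalent to an isinstance check against torch.Tensor.
print(torch.is_tensor(a), isinstance(a, torch.Tensor))
```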

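2.2 Creating Tensors

The previous section already created tensors in a few ways. This section looks more closely at the most important tensor creation functions.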

`torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor`

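Constructs a tensor using the tensor constructor.
Parameters:

data: the initial data for the tensor. Can be a Python list or tuple, a NumPy ndarray, a scalar, or any other type that can represent the data.
dtype: the desired data type of the returned tensor. Can be any of the dtypes shown in Figure 1.
device: whether to use the CPU or the GPU, given as a torch.device.
requires_grad: a bool that determines whether gradients are computed for this tensor. Defaults to False. (Only tensors with a floating-point or complex dtype can set this to True, because differentiation needs non-integer data.)
pin_memory: whether to allocate the tensor in pinned (page-locked) memory. Pinned memory is never swapped out, so it always stays in physical memory, and it can be accessed faster through DMA [4]. Because this parameter is CUDA-related, setting it to True requires a GPU. Defaults to False. If it is set to True without a GPU, the error RuntimeError: Pinned memory requires CUDA. is raised.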

print(torch.tensor([1, 2, 3]))
print(torch.tensor([[1, 2, 3], [4, 5, 6]]))
print(torch.tensor([1, 2, 3], dtype=torch.float64))
print(torch.tensor([1, 2, 3], dtype=torch.float64, requires_grad=True))

# Output:
tensor([1, 2, 3])
tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([1., 2., 3.], dtype=torch.float64)
tensor([1., 2., 3.], dtype=torch.float64, requires_grad=True)
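
Two of the points above can be verified with a short experiment: integer tensors cannot require gradients, and torch.tensor copies its input. This is a minimal sketch; the variable names are illustrative and the exact error message may differ between PyTorch versions.

```python
import numpy as np
import torch

# Integer tensors cannot require gradients; PyTorch raises a RuntimeError.
try:
    torch.tensor([1, 2, 3], requires_grad=True)
    int_grad_allowed = True
except RuntimeError as err:
    int_grad_allowed = False
    print("RuntimeError:", err)

# torch.tensor copies its input, so later changes to the source are not visible.
source = np.array([1.0, 2.0, 3.0])
copied = torch.tensor(source)
source[0] = 100.0
print(copied)  # the first element is still 1.0
```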

`torch.as_tensor(data, dtype=None, device=None) → Tensor`

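Converts data into a tensor and returns it.
Parameters:

data: the initial data for the tensor. Can be a Python list or tuple, a NumPy ndarray, a scalar, or any other type that can represent the data.
dtype: the desired data type of the returned tensor. Can be any of the dtypes shown in Figure 1.
device: whether to use the CPU or the GPU, given as a torch.device.
[Note] Depending on the situation, this function may create a new object (a deep copy) or share memory with data (a shallow copy). The cases are:

If data is a tensor and the dtype and device passed to torch.as_tensor match those of data, no new object is created; if either dtype or device differs from data, a new object is created.
If data is a NumPy ndarray, the same rule applies: as long as dtype and device are left unchanged (the array's own dtype, on the CPU), no copy is made; otherwise a new object is created.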

a = np.array([1, 2, 3])
t = torch.as_tensor(a)
print(t)
a[0] = 100
print(t)
t[1] = 50
print(a)

# Output
tensor([1, 2, 3], dtype=torch.int32)
tensor([100,   2,   3], dtype=torch.int32)
[100  50   3]

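The code above demonstrates a shallow copy: a change made through either object is visible through the other. (The dtype=torch.int32 in the output comes from a platform where NumPy's default integer type is 32-bit; where it is 64-bit, the tensor is int64 and no dtype suffix is printed.)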

a = np.array([1, 2, 3])
t = torch.as_tensor(a, dtype=torch.float32)
print(t)
a[0] = 100
print(t)

# Output
tensor([1., 2., 3.])
tensor([1., 2., 3.])

The code above shows that specifying a different dtype results in a deep copy: when some values of a change, t is not affected.

t1 = torch.tensor([1, 2, 3])
t2 = torch.as_tensor(t1)
t1[0] = 100
print(t2)

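t1 = torch.tensor([1, 2, 3], dtype=torch.float32, requires_grad=True)
t2 = torch.as_tensor(t1)
# Recent PyTorch versions reject in-place writes to a leaf tensor that
# requires grad, so the write is done under torch.no_grad().
with torch.no_grad():
    t1[0] = 100
print(t2)
print(t2.requires_grad)

# Output
tensor([100,   2,   3])
tensor([100.,   2.,   3.], requires_grad=True)
True

Both snippets above are examples of a shallow copy.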
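
The copy rules for NumPy input can also be checked without mutating anything. The sketch below uses a hypothetical helper, `shares_memory`, built on NumPy's `np.shares_memory` and `Tensor.numpy()`; it assumes CPU tensors that do not require gradients.

```python
import numpy as np
import torch

def shares_memory(array, tensor):
    """Hypothetical helper: True if a CPU tensor and a NumPy array overlap in memory."""
    return np.shares_memory(array, tensor.numpy())

a = np.array([1, 2, 3])
same = torch.as_tensor(a)                            # dtype unchanged: no copy
converted = torch.as_tensor(a, dtype=torch.float32)  # dtype changed: copy
copied = torch.tensor(a)                             # torch.tensor always copies

print(shares_memory(a, same))       # True
print(shares_memory(a, converted))  # False
print(shares_memory(a, copied))     # False
```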

`torch.from_numpy(ndarray) → Tensor`

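Creates a tensor from NumPy data. Note that the returned tensor shares its memory with the NumPy array: modifying the tensor also changes the values in the NumPy array, and vice versa.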

a = np.array([1, 2, 3])
t = torch.from_numpy(a)
print(t)  # tensor([1, 2, 3], dtype=torch.int32)
a[0] = 100
print(t)  # tensor([100,   2,   3], dtype=torch.int32)
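
The sharing works in the opposite direction as well. The following sketch, with illustrative variable names, uses `Tensor.numpy()` to get an array backed by the tensor's memory and `clone()` to obtain an independent copy when sharing is not wanted.

```python
import torch

t = torch.ones(3)
n = t.numpy()        # NumPy array backed by the same memory (CPU tensors only)
t.add_(1)            # in-place update, also visible through n
print(n)             # [2. 2. 2.]

# clone() produces a tensor with its own memory.
independent = torch.from_numpy(n).clone()
n[0] = -1            # changes t, but not the clone
print(t)
print(independent)
```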

`torch.zeros(*size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor`

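Returns a tensor filled with zeros, with the shape given by size.
Selected parameters:

size: a sequence of integers giving the shape of the tensor to create. Can be passed as separate integers or as a Python list or tuple.
out: an optional tensor into which the result is written.
layout: a torch.layout value, torch.strided by default. It describes how the tensor is laid out in memory. torch.strided is the main layout in use; torch.sparse_coo is also supported [5]. This parameter therefore rarely needs to be set.
The remaining parameters have the same meaning as in the functions above and are not repeated here.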

a = torch.zeros(2, 3)
print(a)
b = torch.zeros(5)
print(b)

# Output
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([0., 0., 0., 0., 0.])

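Similar to torch.zeros are torch.ones (a tensor filled with ones) and torch.empty (an uninitialized tensor). They are used in the same way as torch.zeros and are not covered in detail here.

In addition, some functions that create randomly initialized tensors follow the same calling convention, such as torch.rand (samples from the uniform distribution on [0, 1)) and torch.randn (samples from the standard normal distribution with mean 0 and variance 1). They are not covered in detail here either.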
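
To make these creation functions concrete, here is a small sketch that checks the properties described above. The seed value and sample size are arbitrary choices for illustration, and the printed statistics are only approximately 0 and 1.

```python
import torch

torch.manual_seed(0)             # arbitrary seed, for repeatable sampling

ones = torch.ones(2, 3)
empty = torch.empty(2, 3)        # contents are arbitrary until written
u = torch.rand(1000)             # uniform on [0, 1)
g = torch.randn(1000)            # standard normal

print(ones.shape, empty.shape)
print(u.min().item() >= 0.0, u.max().item() < 1.0)
print(g.mean().item(), g.std().item())   # roughly 0 and 1
```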

`torch.zeros_like(input, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor`

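Creates a tensor of zeros with the same shape as the input tensor.
Selected parameters:

input: must be a tensor; the output has the same shape as input.
memory_format: the memory format of the returned tensor. Defaults to torch.preserve_format, meaning the same memory format as input.
The remaining parameters have the same meaning as in the functions above and are not repeated here.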

a = torch.empty(2, 3)
print(torch.zeros_like(a))

# Output
tensor([[0., 0., 0.],
        [0., 0., 0.]])

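Similar to torch.zeros_like are torch.ones_like (a tensor of ones with the shape of the given tensor) and torch.empty_like (an uninitialized tensor with the shape of the given tensor). They are used in the same way as torch.zeros_like and are not covered in detail here.

In addition, some random initialization functions follow the same calling convention, such as torch.rand_like (samples from the uniform distribution on [0, 1) with the shape of the given tensor) and torch.randn_like (samples from the standard normal distribution with mean 0 and variance 1, with the shape of the given tensor). They are not covered in detail here either.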
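
One detail worth seeing in code is that the `*_like` functions copy the dtype of the input as well as its shape, unless dtype is overridden. A minimal sketch with illustrative variable names:

```python
import torch

base = torch.arange(6, dtype=torch.int64).reshape(2, 3)

z = torch.zeros_like(base)                       # same shape and dtype as base
o = torch.ones_like(base, dtype=torch.float32)   # same shape, dtype overridden
r = torch.rand_like(o)                           # random values need a floating dtype

print(z.dtype, z.shape)
print(o.dtype, o.shape)
print(r.dtype, r.shape)
```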

`torch.arange(start=0, end, step=1, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor`

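Returns a 1-D tensor whose values are taken from the interval [start, end) with step size step.
Selected parameters:

start: the starting value. Defaults to 0. Can be an integer or a float.
end: the end value, which is itself excluded (the interval is closed on the left and open on the right). It has no default. Can be an integer or a float.
step: the step size, that is, the distance between adjacent values. Defaults to 1.
The remaining parameters have the same meaning as in the functions above and are not repeated here.
[Note] torch.range is deprecated; prefer torch.arange. The two are similar but not identical: torch.range includes end in the result, whereas torch.arange excludes it.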

print(torch.arange(5))
print(torch.arange(1, 4))
print(torch.arange(1, 2.51, 0.5))

# Output
tensor([0, 1, 2, 3, 4])
tensor([1, 2, 3])
tensor([1.0000, 1.5000, 2.0000, 2.5000])
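
The number of elements produced by torch.arange follows from the half-open interval: it is ceil((end - start) / step). The sketch below checks this for the examples above; the helper name `arange_len` is hypothetical.

```python
import math
import torch

def arange_len(start, end, step):
    """Hypothetical helper: expected number of elements in torch.arange(start, end, step)."""
    return math.ceil((end - start) / step)

floats = torch.arange(1, 2.51, 0.5)
ints = torch.arange(0, 10, 3)

print(len(floats), arange_len(1, 2.51, 0.5))
print(len(ints), arange_len(0, 10, 3))

# Integer arguments give an integer tensor; any float argument gives a float tensor.
print(ints.dtype, floats.dtype)
```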

2.3 Indexing, Slicing, Joining, and Mutating Operations

`torch.cat(tensors, dim=0, out=None) → Tensor`

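Concatenates the sequence of tensors in tensors into a new tensor. torch.cat() can be seen as the inverse of torch.split() and torch.chunk().
Parameters:

tensors: a Python sequence of tensors. Their sizes must match in every dimension except the one being concatenated.
dim: an integer giving the dimension along which the tensors are joined. Dimensions are numbered from 0 for the outermost bracket of the array to n-1 for the innermost; see the examples below.
out: an optional output tensor.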

a = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(a)
print(torch.cat([a, a, a], dim=0))
print(torch.cat((a, a), dim=1))
print(torch.cat((a, a), dim=-1))

# Output
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([[1, 2, 3, 1, 2, 3],
        [4, 5, 6, 4, 5, 6],
        [7, 8, 9, 7, 8, 9]])
tensor([[1, 2, 3, 1, 2, 3],
        [4, 5, 6, 4, 5, 6],
        [7, 8, 9, 7, 8, 9]])

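As shown above, a is a 3×3 two-dimensional tensor. Concatenating with dim=0 joins along the outermost level of the array, which here means stacking rows, so joining three copies gives a 9×3 result. Concatenating with dim=1 joins along the second level, which here means appending columns, so joining two copies gives a 3×6 result. Concatenating with dim=-1 joins along the innermost level, which for a 2-D tensor is the same as dim=1.
To illustrate further, suppose a tensor has shape (2, 3, 4). Joining two such tensors should give (4, 3, 4) with dim=0, (2, 6, 4) with dim=1, and (2, 3, 8) with dim=2. The experiment below confirms this: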

a = torch.tensor(np.arange(24).reshape(2, 3, 4))
print(a.shape)
print(torch.cat([a, a], dim=0).size())
print(torch.cat([a, a], dim=1).size())
print(torch.cat([a, a], dim=2).size())
print(torch.cat([a, a], dim=2))

# Output
torch.Size([2, 3, 4])
torch.Size([4, 3, 4])
torch.Size([2, 6, 4])
torch.Size([2, 3, 8])
tensor([[[ 0,  1,  2,  3,  0,  1,  2,  3],
         [ 4,  5,  6,  7,  4,  5,  6,  7],
         [ 8,  9, 10, 11,  8,  9, 10, 11]],

        [[12, 13, 14, 15, 12, 13, 14, 15],
         [16, 17, 18, 19, 16, 17, 18, 19],
         [20, 21, 22, 23, 20, 21, 22, 23]]], dtype=torch.int32)

The results match the expectation.
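
The shape requirement on tensors can be illustrated with inputs of different sizes: only the concatenated dimension may differ. This is a minimal sketch with illustrative names; the exact wording of the error message is not shown because it can vary between versions.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(4, 3)

# Sizes differ only along dim=0, so concatenating along dim=0 works.
rows = torch.cat([a, b], dim=0)
print(rows.shape)  # torch.Size([6, 3])

# Along dim=1 the row counts (2 vs 4) would have to match, so this fails.
try:
    torch.cat([a, b], dim=1)
    mismatch_allowed = True
except RuntimeError:
    mismatch_allowed = False
print(mismatch_allowed)  # False
```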

`torch.chunk(input, chunks, dim=0) → List of Tensors`

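The inverse of torch.cat. It splits a tensor into chunks pieces along the given dimension. The last piece may be smaller than the others (this happens when the size of dimension dim is not divisible by chunks). The pieces are returned as a Python tuple.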

a = torch.tensor([[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]])
print(a)  # shape: (3, 4)
print(torch.chunk(a, 2, dim=0))  # shape: [(2, 4), (1, 4)]
print(torch.chunk(a, 2, dim=1))  # shape: [(3, 2), (3, 2)]
print(type(torch.chunk(a, 2, dim=1)))

# Output
tensor([[ 1,  2,  3,  4],
        [ 4,  5,  6,  7],
        [ 7,  8,  9, 10]])
(tensor([[1, 2, 3, 4],
        [4, 5, 6, 7]]), tensor([[ 7,  8,  9, 10]]))
(tensor([[1, 2],
        [4, 5],
        [7, 8]]), tensor([[ 3,  4],
        [ 6,  7],
        [ 9, 10]]))
<class 'tuple'>
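
Since torch.chunk and torch.cat are inverses, splitting a tensor and joining the pieces along the same dimension should give back the original. A short sketch with illustrative variable names:

```python
import torch

a = torch.arange(12).reshape(3, 4)

parts = torch.chunk(a, 2, dim=0)     # pieces of 2 rows and 1 row
restored = torch.cat(parts, dim=0)   # join them back along the same dimension

print([tuple(p.shape) for p in parts])  # [(2, 4), (1, 4)]
print(torch.equal(restored, a))         # True
```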

`torch.reshape(input, shape) → Tensor`

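Reshapes the tensor input to a different shape. The number of elements must be the same before and after reshaping; otherwise an error is raised.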

a = torch.arange(12)
print(a)
print(torch.reshape(a, (3, 4)))
print(torch.reshape(a, (-1, 2)))

# Output
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        [10, 11]])

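[Note] How torch.reshape and view (the Tensor.view method) differ and relate:
Both reshape a given tensor to a specified shape, but torch.reshape applies more broadly than view. Their main characteristics are:

When the input tensor is contiguous, both return a view that shares memory with input (a shallow copy).
When the input tensor is not contiguous, view generally raises an error (unless the requested shape happens to be compatible with the existing strides), whereas torch.reshape succeeds and returns a new, reshaped object (a deep copy).
view always returns a shallow copy, but it cannot always be applied (for example, to many non-contiguous inputs).
torch.reshape may return either a shallow copy or a deep copy, but every valid reshape is guaranteed to succeed.
Taken together, torch.reshape is the safer default.
This leaves the question of what a contiguous tensor is. Arrays in C, C++, and most other languages are stored in memory in row-major order. A transpose, input.T or input.t(), is a shallow copy, but the resulting object is no longer a contiguous tensor. The article referenced here explains this process [6]. Some experiments: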

z = torch.zeros(3, 2)
x = z.t()
z[0][0] = 100
print("New x: {}".format(x))
print("Reshape: {}".format(x.reshape(6)))
print("View: {}".format(x.view(6)))

# Output
New x: tensor([[100.,   0.,   0.],
        [  0.,   0.,   0.]])
Reshape: tensor([100.,   0.,   0.,   0.,   0.,   0.])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
 in 
      4 print("New x: {}".format(x))
      5 print("Reshape: {}".format(x.reshape(6)))
----> 6 print("View: {}".format(x.view(6)))

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

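The example shows that transposing is a shallow copy (the change to z is visible through x), that the transposed object can be reshaped with reshape, and that view fails with the "not compatible" error. In practice, if the difference between the two is unclear, or view raises the error above, use reshape instead.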
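
The behaviour described above can be inspected with is_contiguous() and data_ptr(). The following sketch assumes that comparing data_ptr() values is an adequate way to tell whether two tensors start at the same memory address; the variable names are illustrative.

```python
import torch

z = torch.zeros(3, 2)
x = z.t()                                     # transpose: a view on the same memory

print(z.is_contiguous(), x.is_contiguous())   # True False
same_storage = x.data_ptr() == z.data_ptr()   # the transpose shares memory with z

# reshape on a non-contiguous tensor has to copy the data.
r = x.reshape(6)
r_shares = r.data_ptr() == x.data_ptr()

# reshape on a contiguous tensor returns a view without copying.
flat = z.reshape(6)
flat_shares = flat.data_ptr() == z.data_ptr()

# view works again after making an explicit contiguous copy.
v = x.contiguous().view(6)

print(same_storage, r_shares, flat_shares, v.shape)
```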

`torch.squeeze(input, dim=None, out=None) → Tensor`

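Removes all dimensions of size 1. The returned tensor shares memory with the original tensor input (a shallow copy). If dim is given, only that dimension is removed, and only if its size is 1.
For example, if input has shape (A, 1, B, C, 1, D), then dim=None gives the shape (A, B, C, D). With dim=0 the shape is unchanged, because dimension 0 does not have size 1. With dim=1 the new shape is (A, B, C, 1, D).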

t = torch.ones(2, 1, 3, 4, 1, 5)
print(t.size())  # (2, 1, 3, 4, 1, 5)
print(torch.squeeze(t).size())  # (2, 3, 4, 5)
print(torch.squeeze(t, 0).size())  # (2, 1, 3, 4, 1, 5)
print(torch.squeeze(t, 1).size())  # (2, 3, 4, 1, 5)

# Output
torch.Size([2, 1, 3, 4, 1, 5])
torch.Size([2, 3, 4, 5])
torch.Size([2, 1, 3, 4, 1, 5])
torch.Size([2, 3, 4, 1, 5])
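
The memory sharing mentioned above, and the reverse operation torch.unsqueeze, can be seen in a few lines. This is a sketch with illustrative variable names.

```python
import torch

t = torch.zeros(2, 1, 3)
s = torch.squeeze(t, 1)        # shape (2, 3), shares memory with t

s[0, 0] = 7                    # the write is visible through t as well
print(t[0, 0, 0].item())       # 7.0

u = torch.unsqueeze(s, 1)      # put the size-1 dimension back: shape (2, 1, 3)
print(s.shape, u.shape)
```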

`torch.gather(input, dim, index, out=None, sparse_grad=False) → Tensor`

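Gathers elements along the given dim according to the indices in index. The output has the same shape as index. index must be an n-dimensional tensor with the same number of dimensions as input. For a 3-D tensor, the extraction rules are: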

out[i][j][k] = input[index[i][j][k]][j][k]  # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k]  # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]]  # if dim == 2

These rules are long and easy to get lost in. It is easier to work through an example first to see how the result is produced, and then return to the formulas.

t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(t)
print(torch.gather(t, 0, torch.tensor([[1, 2, 0], [2, 0, 2], [0, 1, 1]])))

# Output
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
tensor([[4, 8, 3],
        [7, 2, 9],
        [1, 5, 6]])

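In the example above, dim=0 means the values in index select along the first dimension, that is, they choose rows.
The first row of index is [1, 2, 0]. For each column j, it gives the row of input from which the value is taken. Output row 0 therefore consists of the values at row 1, column 0 (counting from 0); row 2, column 1; and row 0, column 2, which gives [4, 8, 3]. Likewise, output row 1 uses the index row [2, 0, 2], so it takes row 2, column 0; row 0, column 1; and row 2, column 2, giving [7, 2, 9]. Continuing in the same way, output row 2 is [1, 5, 6], which matches the printed result.
Returning to the extraction rules, the example uses the dim == 0 rule, out[i][j][k] = input[index[i][j][k]][j][k], which for a 2-D tensor reduces to out[i][j] = input[index[i][j]][j]. With dim=0, each output element keeps its own coordinates in every other dimension, and only the first-dimension coordinate is looked up in index. The description is a little awkward in words; working through the example by hand makes it clearer.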
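
The dim == 0 rule can be written out as explicit loops and compared with torch.gather. The helper `gather_dim0` below is a hypothetical reference implementation for 2-D tensors only, not how PyTorch implements gather. The second part sketches a common use of dim=1: picking one score per row according to a label; the scores and labels are made-up values.

```python
import torch

def gather_dim0(inp, index):
    """Hypothetical 2-D reference: out[i][j] = inp[index[i][j]][j]."""
    out = torch.empty_like(index, dtype=inp.dtype)
    for i in range(index.shape[0]):
        for j in range(index.shape[1]):
            out[i, j] = inp[index[i, j], j]
    return out

t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
idx = torch.tensor([[1, 2, 0], [2, 0, 2], [0, 1, 1]])

manual = gather_dim0(t, idx)
builtin = torch.gather(t, 0, idx)
print(torch.equal(manual, builtin))  # True

# dim=1: pick one entry per row, e.g. the score of each row's label.
scores = torch.tensor([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]])
labels = torch.tensor([1, 0, 2])
picked = scores.gather(1, labels.unsqueeze(1)).squeeze(1)
print(picked)
```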

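3. Mathematical Operations

3.1 Element-wise Operations

Element-wise operations act on each element of a tensor separately and produce a new tensor. They include absolute value, addition, subtraction, multiplication, division, and so on. Only a few representative ones are introduced here.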

`torch.abs(input, out=None) → Tensor`

Computes the absolute value of each element of the tensor input and returns a new tensor.

print(torch.abs(torch.tensor([1, -2, 3, -4, 0])))

# Output
tensor([1, 2, 3, 4, 0])

`torch.add()`

Performs element-wise addition of tensors. It has two overloads, briefly described below.

torch.add(input, other, out=None): out = input + other
torch.add(input, other, *, alpha=1, out=None): out = input + alpha × other

a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print(a)
print(torch.add(a, a+1))
print(torch.add(a, a+1, alpha=0.1))

# Output
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 3.,  5.,  7.],
        [ 9., 11., 13.]])
tensor([[1.2000, 2.3000, 3.4000],
        [4.5000, 5.6000, 6.7000]])

torch.sub() performs subtraction and is used in the same way as torch.add(), so it is not covered further here.
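
The functional forms correspond to the usual arithmetic operators, and alpha scales the second operand before it is added or subtracted. A minimal sketch with illustrative variable names:

```python
import torch

a = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
b = a + 1

# The + operator and torch.add compute the same result.
same_as_operator = torch.equal(a + b, torch.add(a, b))

# alpha scales `other` first: add gives a + alpha * b, sub gives a - alpha * b.
scaled_add = torch.add(a, b, alpha=0.1)
scaled_sub = torch.sub(a, b, alpha=2)

print(same_as_operator)
print(torch.allclose(scaled_add, a + 0.1 * b))
print(torch.allclose(scaled_sub, a - 2 * b))
```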

`torch.mul(input, other, out=None)`

If other is a tensor, returns the tensor of element-wise products of input and other. If other is a scalar, multiplies every element of input by other.

a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print(a)
print(torch.mul(a, a))
print(torch.mul(a, 2))

# Output
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 1.,  4.,  9.],
        [16., 25., 36.]])
tensor([[ 2.,  4.,  6.],
        [ 8., 10., 12.]])

torch.div() performs division and is used in the same way as torch.mul(), so it is not covered further here.

`torch.sigmoid(input, out=None) → Tensor`

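Returns the sigmoid of the tensor input, computed element by element as out = 1 / (1 + exp(-input)). The function is worth mentioning because it is so widely used: many models use sigmoid as the activation function of their classifier.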

a = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=torch.float32)
print(torch.sigmoid(a))

# Output
tensor([[0.7311, 0.8808, 0.9526, 0.9820],
        [0.9933, 0.9975, 0.9991, 0.9997]])
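
As a check of the formula, the sigmoid can be computed by hand from torch.exp and compared with the built-in function. A short sketch with illustrative variable names:

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])

manual = 1 / (1 + torch.exp(-x))   # the formula applied element by element
builtin = torch.sigmoid(x)

print(manual)
print(builtin)
print(torch.allclose(manual, builtin))  # True
```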

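3.2 Reduction Operations

A reduction produces its output by applying a mathematical operation across the input tensor as a whole, or across one of its dimensions, such as the mean, variance, minimum, or maximum.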

`torch.argmax(input, dim, keepdim=False) → LongTensor`

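Returns the indices of the maximum values along the given axis. (If only input is passed, it returns the index of the maximum over all elements of the flattened tensor.) keepdim controls whether the output keeps the same number of dimensions as the input.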

a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print(a)
print(torch.argmax(a))
print(torch.argmax(a, dim=0))  # reduces away the outermost bracket; result shape is (3,)
print(torch.argmax(a, dim=1))  # reduces away the second-level bracket; result shape is (2,)
print(torch.argmax(a, dim=1, keepdim=True))  # result shape is (2, 1)

# Output
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor(5)
tensor([1, 1, 1])
tensor([2, 2])
tensor([[2],
        [2]])

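Other functions with similar usage include:

torch.argmin: the indices of the minimum values along the given axis
torch.mean: the mean along the given axis
torch.median: the median along the given axis
torch.std: the standard deviation along the given axis
torch.sum: the sum along the given axis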
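
The functions listed above take dim and keepdim in the same way as torch.argmax. The sketch below applies a few of them to the same 2×3 tensor; the variable names are illustrative, and torch.std is called with its default settings, which give the sample (unbiased) standard deviation.

```python
import torch

a = torch.tensor([[1., 2., 3.], [4., 5., 6.]])

total = torch.sum(a)                                  # sum over all elements
col_sum = torch.sum(a, dim=0)                         # one value per column
row_mean = torch.mean(a, dim=1)                       # one value per row
row_mean_kept = torch.mean(a, dim=1, keepdim=True)    # keeps a size-1 dimension
row_argmin = torch.argmin(a, dim=1)                   # index of each row's minimum
row_std = torch.std(a, dim=1)                         # sample standard deviation per row

print(total, col_sum, row_mean)
print(row_mean_kept.shape, row_argmin, row_std)
```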

3.3 Comparison Operations

`torch.argsort(input, dim=-1, descending=False) → LongTensor`

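Returns a tensor of the indices that sort input in ascending order along the given dimension. The descending parameter switches to descending order. By default the innermost dimension (-1) is used.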

a = torch.tensor([[1, 3, 2, 4], [9, 7, 5, 10]])
print(a)
print(torch.argsort(a))  # default dim=-1: sort within each row
print(torch.argsort(a, dim=0))  # sort within each column
print(torch.argsort(a, dim=0, descending=True))  # sort within each column, descending
print(torch.argsort(a, dim=1))  # same as the default dim in this example: sort within each row

# Output
tensor([[ 1,  3,  2,  4],
        [ 9,  7,  5, 10]])
tensor([[0, 2, 1, 3],
        [2, 1, 0, 3]])
tensor([[0, 0, 0, 0],
        [1, 1, 1, 1]])
tensor([[1, 1, 1, 1],
        [0, 0, 0, 0]])
tensor([[0, 2, 1, 3],
        [2, 1, 0, 3]])

`torch.sort(input, dim=-1, descending=False, out=None) -> (Tensor, LongTensor)`

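Sorts the elements of the tensor input along the given dimension. The descending parameter switches to descending order. By default the innermost dimension (-1) is used.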

a = torch.tensor([[1, 3, 2, 4], [9, 7, 5, 10]])
print(a)
print(torch.sort(a))  # default dim=-1: sort within each row
print(torch.sort(a, dim=0))  # sort within each column
print(torch.sort(a, dim=0, descending=True))  # sort within each column, descending
print(torch.sort(a, dim=1))  # same as the default dim in this example: sort within each row

# Output
tensor([[ 1,  3,  2,  4],
        [ 9,  7,  5, 10]])
torch.return_types.sort(
values=tensor([[ 1,  2,  3,  4],
        [ 5,  7,  9, 10]]),
indices=tensor([[0, 2, 1, 3],
        [2, 1, 0, 3]]))
torch.return_types.sort(
values=tensor([[ 1,  3,  2,  4],
        [ 9,  7,  5, 10]]),
indices=tensor([[0, 0, 0, 0],
        [1, 1, 1, 1]]))
torch.return_types.sort(
values=tensor([[ 9,  7,  5, 10],
        [ 1,  3,  2,  4]]),
indices=tensor([[1, 1, 1, 1],
        [0, 0, 0, 0]]))
torch.return_types.sort(
values=tensor([[ 1,  2,  3,  4],
        [ 5,  7,  9, 10]]),
indices=tensor([[0, 2, 1, 3],
        [2, 1, 0, 3]]))

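Returns a named tuple (values, indices), where values holds the sorted values along the given dimension and indices holds the position of each output value in the original tensor (the same result as torch.argsort).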
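
The relationship between values, indices, and torch.argsort can be checked by rebuilding the sorted values from the indices with torch.gather. This sketch uses a tensor without repeated values, so the ordering of ties does not matter; the variable names are illustrative.

```python
import torch

a = torch.tensor([[1, 3, 2, 4], [9, 7, 5, 10]])

# The named tuple can be unpacked like an ordinary tuple.
values, indices = torch.sort(a, dim=1, descending=True)

# Looking up a at the returned indices reproduces the sorted values.
rebuilt = torch.gather(a, 1, indices)

print(values)
print(torch.equal(rebuilt, values))                                     # True
print(torch.equal(indices, torch.argsort(a, dim=1, descending=True)))   # True
```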

`torch.equal(input, other) → bool`

Returns True if the two tensors have the same size and all corresponding elements are equal. Otherwise returns False.

print(torch.equal(torch.tensor([1, 2]), torch.tensor([1, 2])))  # True
print(torch.equal(torch.tensor([1, 2]), torch.tensor([1, 0])))  # False

# Output
True
False

`torch.max(input, dim, keepdim=False, out=None) -> (Tensor, LongTensor)`

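Returns a named tuple (values, indices), where values holds the maximum values along the given dimension and indices holds their positions (the same result as torch.argmax).
Note that when the maximum occurs more than once, the returned index is not necessarily that of its first occurrence. In the PyTorch version described here this depends on whether the CPU or the GPU is used, and the two may give different results; check the documentation of the version in use.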

a = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float32)
print(a)
print(torch.max(a))
print(torch.max(a, dim=0))
print(torch.max(a, dim=1))
print(torch.max(a, dim=1, keepdim=True))

# Output
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor(6.)
torch.return_types.max(
values=tensor([4., 5., 6.]),
indices=tensor([1, 1, 1]))
torch.return_types.max(
values=tensor([3., 6.]),
indices=tensor([2, 2]))
torch.return_types.max(
values=tensor([[3.],
        [6.]]),
indices=tensor([[2],
        [2]]))
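
In practice the named tuple returned by torch.max with a dim argument is usually unpacked or accessed by field name. A minimal sketch with illustrative variable names, using a tensor in which each row has a unique maximum:

```python
import torch

a = torch.tensor([[1., 2., 3.], [4., 5., 6.]])

# Unpack like a tuple ...
values, indices = torch.max(a, dim=1)

# ... or access the fields by name.
result = torch.max(a, dim=1)
print(result.values, result.indices)

# With a unique maximum per row, the indices agree with torch.argmax.
print(torch.equal(indices, torch.argmax(a, dim=1)))  # True
```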

`torch.max(input, other, out=None) → Tensor`

Returns a new tensor holding, at each position, the larger of the corresponding elements of input and other.

a = torch.rand(4)
b = torch.rand(4)
print(a)
print(b)
print(torch.max(a, b))

# Output
tensor([0.3577, 0.6315, 0.8924, 0.9124])
tensor([0.9137, 0.6503, 0.9454, 0.5358])
tensor([0.9137, 0.6503, 0.9454, 0.9124])

torch.min (which finds minimum values) is used in the same way as torch.max.
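
The two-tensor form is an element-wise selection, which can be expressed equivalently with torch.where. The sketch below uses fixed values rather than torch.rand so that the result is predictable; the variable names are illustrative.

```python
import torch

a = torch.tensor([0.3, 0.6, 0.9])
b = torch.tensor([0.5, 0.2, 0.9])

larger = torch.max(a, b)               # element-wise maximum
smaller = torch.min(a, b)              # element-wise minimum
selected = torch.where(a > b, a, b)    # pick a where it is larger, else b

print(larger)
print(smaller)
print(torch.equal(larger, selected))   # True
```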

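4. Summary

This part introduced some of the more commonly used tensor functions, grouped into sections that outline the range of tensor operations. The full set of functions is in the official documentation. There is no need to memorize them all; it is enough to know how to look up a function and its usage when it is needed.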
