Understanding the dim argument of torch.max and F.softmax in PyTorch

Whenever I use torch.max or F.softmax, I get a little confused about which dim to pass, so here is a summary:

First, an example with a two-dimensional tensor:

import torch
import torch.nn.functional as F

input = torch.randn(3,4)
print(input)
tensor([[-0.5526, -0.0194,  2.1469, -0.2567],
        [-0.3337, -0.9229,  0.0376, -0.0801],
        [ 1.4721,  0.1181, -2.6214,  1.7721]])

b = F.softmax(input,dim=0) # softmax down each column (over dim 0); each column sums to 1
print(b)
tensor([[0.1018, 0.3918, 0.8851, 0.1021],
        [0.1268, 0.1587, 0.1074, 0.1218],
        [0.7714, 0.4495, 0.0075, 0.7762]])

c = F.softmax(input,dim=1)   # softmax along each row (over dim 1); each row sums to 1
print(c)
tensor([[0.0529, 0.0901, 0.7860, 0.0710],
        [0.2329, 0.1292, 0.3377, 0.3002],
        [0.3810, 0.0984, 0.0064, 0.5143]])

d = torch.max(input,dim=0)    # max of each column (reduces dim 0); returns values and row indices
print(d)
torch.return_types.max(
values=tensor([1.4721, 0.1181, 2.1469, 1.7721]),
indices=tensor([2, 2, 0, 2]))

e = torch.max(input,dim=1)   # max of each row (reduces dim 1); returns values and column indices
print(e)
torch.return_types.max(
values=tensor([2.1469, 0.0376, 1.7721]),
indices=tensor([2, 2, 3]))
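
To make the meaning of dim concrete, here is a minimal pure-Python sketch that redoes the computations above by hand on the same 3x4 values. The helpers softmax_1d, transpose, softmax_2d and max_2d are hypothetical names written only for this illustration; they are not PyTorch functions. Because the input is copied from the rounded printout, the results should match the tensors above only to about three decimal places.

```python
import math

# Same values as the 3x4 tensor printed above (rounded to 4 decimals).
data = [
    [-0.5526, -0.0194,  2.1469, -0.2567],
    [-0.3337, -0.9229,  0.0376, -0.0801],
    [ 1.4721,  0.1181, -2.6214,  1.7721],
]

def softmax_1d(xs):
    """Softmax of a flat list (shifted by the max for numerical stability)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(rows):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*rows)]

def softmax_2d(rows, dim):
    """Hypothetical helper mimicking F.softmax(x, dim) for a 2-D list."""
    if dim == 1:
        # Normalize inside each row.
        return [softmax_1d(r) for r in rows]
    # dim == 0: normalize inside each column, then restore the layout.
    return transpose([softmax_1d(col) for col in transpose(rows)])

def max_2d(rows, dim):
    """Hypothetical helper mimicking torch.max(x, dim): (values, indices)."""
    lines = rows if dim == 1 else transpose(rows)
    values = [max(line) for line in lines]
    indices = [line.index(v) for line, v in zip(lines, values)]
    return values, indices

b = softmax_2d(data, dim=0)
c = softmax_2d(data, dim=1)
print([round(sum(col), 4) for col in transpose(b)])  # one sum per column
print([round(sum(row), 4) for row in c])             # one sum per row
print(max_2d(data, dim=0))  # one max per column, with its row index
print(max_2d(data, dim=1))  # one max per row, with its column index
```

The pattern to remember: dim names the axis that gets normalized (softmax) or collapsed (max), so dim=0 works down the columns and dim=1 works across the rows.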

Next, an example with a three-dimensional tensor:

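softmax turns the values along the chosen dimension into a probability distribution.

With b = F.softmax(a, dim=0) as defined below, b holds a distribution along dim 0, so for any fixed position in the other two dimensions the three entries sum to 1, e.g. b[0][5][6]+b[1][5][6]+b[2][5][6]=1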

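a=torch.rand(3,16,20)
b=F.softmax(a,dim=0)   # normalizes across the 3 slices: b[:,i,j] sums to 1
c=F.softmax(a,dim=1)   # normalizes across the 16 rows: c[i,:,j] sums to 1
d=F.softmax(a,dim=2)   # normalizes across the 20 columns: d[i,j,:] sums to 1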
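
The claim about the sums can be checked for every position at once instead of one index at a time. The following is a minimal sketch, assuming a reasonably recent PyTorch install; the variable names out, sums and same are only illustrative. Summing a softmax output over the same dim that was normalized should give a tensor of ones whose shape is the input shape with that dim removed.

```python
import torch
import torch.nn.functional as F

a = torch.rand(3, 16, 20)

for dim in range(a.dim()):
    out = F.softmax(a, dim=dim)
    sums = out.sum(dim=dim)  # collapse the dimension that was normalized
    # The shape is the input shape with `dim` removed; entries are ~1
    # up to floating-point rounding.
    print(dim, tuple(sums.shape), torch.allclose(sums, torch.ones_like(sums)))

# Negative dims count from the end, so dim=-1 means dim=2 for this tensor.
same = torch.equal(F.softmax(a, dim=-1), F.softmax(a, dim=2))
print(same)
```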


In [1]: import torch as t

In [2]: import torch.nn.functional as F

In [4]: a=t.Tensor(3,4,5)  # uninitialized memory, so the values of a are arbitrary

In [5]: b=F.softmax(a,dim=0)

In [6]: c=F.softmax(a,dim=1)

In [7]: d=F.softmax(a,dim=2)

In [8]: a
Out[8]:
tensor([[[-0.1581,  0.0000,  0.0000,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]]])

In [9]: b
Out[9]:
tensor([[[0.3064, 0.3333, 0.3410, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]]])

In [10]: b.sum()
Out[10]: tensor(20.0000)

In [11]: b[0][0][0]+b[1][0][0]+b[2][0][0]
Out[11]: tensor(1.0000)

In [12]: c.sum()
Out[12]: tensor(15.)

In [13]: c
Out[13]:
tensor([[[0.2235, 0.2543, 0.2521, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543],
         [0.2529, 0.2543, 0.2436, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]]])

In [14]: n=t.rand(3,4)

In [15]: n
Out[15]:
tensor([[0.2769, 0.3475, 0.8914, 0.6845],
        [0.9251, 0.3976, 0.8690, 0.4510],
        [0.8249, 0.1157, 0.3075, 0.3799]])

In [16]: m=t.argmax(n,dim=0)

In [17]: m
Out[17]: tensor([1, 1, 0, 0])

In [18]: p=t.argmax(n,dim=1)

In [19]: p
Out[19]: tensor([2, 0, 0])

In [20]: d.sum()
Out[20]: tensor(12.0000)

In [22]: d
Out[22]:
tensor([[[0.1771, 0.2075, 0.2075, 0.2075, 0.2005],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]]])

In [23]: d[0][0].sum()
Out[23]: tensor(1.)
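
The totals in the session (20, 15 and 12) follow from the shape alone: softmax over one dim of a (3, 4, 5) tensor produces one distribution per remaining position, so the grand total should equal the number of elements divided by the size of that dim. The sketch below checks this, and also checks that t.argmax(n, dim) agrees with the indices returned by torch.max(n, dim) on the n printed in the session. It is a minimal illustration, not part of the original session; torch.rand is used in place of t.Tensor(3,4,5) so that the values are well defined, and the variable names are only illustrative.

```python
import torch
import torch.nn.functional as F

a = torch.rand(3, 4, 5)  # stand-in values; only the shape matters here

for dim in range(a.dim()):
    total = F.softmax(a, dim=dim).sum().item()
    expected = a.numel() / a.size(dim)  # 60/3 = 20, 60/4 = 15, 60/5 = 12
    print(dim, round(total, 4), expected)

# The 3x4 tensor n from the session, copied from its printout.
n = torch.tensor([[0.2769, 0.3475, 0.8914, 0.6845],
                  [0.9251, 0.3976, 0.8690, 0.4510],
                  [0.8249, 0.1157, 0.3075, 0.3799]])

for dim in (0, 1):
    # torch.max with a dim returns a (values, indices) named tuple.
    values, indices = torch.max(n, dim=dim)
    print(dim, indices.tolist(), torch.equal(indices, torch.argmax(n, dim=dim)))
```

In other words, when only the position of the maximum is needed, torch.argmax is enough; torch.max with a dim gives the same positions plus the maximum values themselves.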

 
