Torch - nn Module Notes

1. Serialization

Torch provides four methods for serializing/deserializing Lua/Torch objects.

  • torch.save(filename, object [, format, referenced])

    Writes object to the file filename.

    format can be ascii or binary (the default).

    Example:

    -- arbitrary object:
    obj = {
     mat = torch.randn(10,10),
     name = '10',
     test = {
        entry = 1
     }
    }
    
    -- save to disk:
    torch.save('test.dat', obj)
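
    The format argument shown in the signature above selects a portable, human-readable encoding instead of the default binary one. A minimal sketch (the filename is only illustrative):

    -- save the same object in ascii format:
    torch.save('test-ascii.dat', obj, 'ascii')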
  • [object] torch.load(filename [, format, referenced])

    Reads an object back from the file filename.

    Example:

    -- given serialized object from section above, reload:
    obj = torch.load('test.dat')
    
    print(obj)
    -- will print:
    -- {[mat]  = DoubleTensor - size: 10x10
    --  [name] = string : "10"
    --  [test] = table - size: 0}
  • [str] torch.serialize(object)

    Serializes object into a string.

    Example:

    -- arbitrary object:
    obj = {
     mat = torch.randn(10,10),
     name = '10',
     test = {
        entry = 1
     }
    }
    
    -- serialize:
    str = torch.serialize(obj)
  • [object] torch.deserialize(str)

    Deserializes an object from the string str.

    Example:

    -- given serialized object from section above, deserialize:
    obj = torch.deserialize(str)
    
    print(obj)
    -- will print:
    -- {[mat]  = DoubleTensor - size: 10x10
    --  [name] = string : "10"
    --  [test] = table - size: 0}

2. Module

Module is an abstract class that defines the fundamental methods necessary for training a neural network; network architectures are built by combining modules through these member functions.

It maintains two kinds of state variables: output and gradInput.

  • [output] forward(input)

    Computes the output of the module for the given input. In general, input and output are Tensors.

    Overriding forward itself is not recommended; custom modules should implement updateOutput(input) instead.

  • [gradInput] backward(input, gradOutput)

    Performs a backpropagation step through the module with respect to the given input. It must be called after forward has been run on the same input; this is the step used when optimizing the network.

    In general, input, gradOutput and gradInput are Tensors.

    Backpropagation involves two kinds of gradient computations, each with its corresponding function:

    gradient with respect to the input - updateGradInput(input, gradOutput)

    gradient with respect to the module's parameters - accGradParameters(input, gradOutput, scale), which accumulates the gradients, scaled by scale, into gradParameters
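
    A minimal sketch of a single forward/backward pass on one module (the layer sizes and random data are only illustrative):

    -- forward/backward through a single Linear module:
    module = nn.Linear(10, 5)
    input = torch.randn(10)
    output = module:forward(input)                  -- calls updateOutput internally
    gradOutput = torch.randn(5)                     -- pretend gradient from the next layer
    gradInput = module:backward(input, gradOutput)  -- updateGradInput + accGradParameters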

  • zeroGradParameters()

    If the module has parameters, this function zeroes the accumulated parameter gradients gradParameters.

  • updateParameters(learningRate)

    Updates the module's parameters according to:

    parameters = parameters - learningRate * gradients_wrt_parameters

  • accUpdateGradParameters(input, gradOutput, learningRate)

    Accumulates the parameter gradients and updates the parameters in a single step.
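
    Putting zeroGradParameters, backward and updateParameters together, a minimal sketch of one manual SGD step (the layer sizes, target and learning rate are only illustrative):

    -- one manual training step with plain SGD:
    module = nn.Linear(10, 1)
    criterion = nn.MSECriterion()
    x, y = torch.randn(10), torch.Tensor{1}
    module:zeroGradParameters()                               -- clear accumulated gradients
    local err = criterion:forward(module:forward(x), y)
    module:backward(x, criterion:backward(module.output, y))
    module:updateParameters(0.01)                             -- parameters <- parameters - 0.01 * grads
    print(err)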

  • share(mlp,s1,s2,…,sn)

    Modifies the parameters named s1, ..., sn of the calling module (if they exist) so that they are shared with, i.e. point to, the parameters with the same names in the given module mlp.

    Example:

    -- make an mlp
    mlp1=nn.Sequential();
    mlp1:add(nn.Linear(100,10));
    
    -- make a second mlp
    mlp2=nn.Sequential();
    mlp2:add(nn.Linear(100,10));
    
    -- the second mlp shares the bias of the first
    mlp2:share(mlp1,'bias');
    
    -- we change the bias of the first
    mlp1:get(1).bias[1]=99;
    
    -- and see that the second one's bias has also changed..
    print(mlp2:get(1).bias[1])
  • clone(mlp,…)

    Creates a deep copy of the module, including the current state of its parameters. If extra arguments are given, clone also calls share with them on the copy, so the listed parameters remain shared with the original (as in the example below).

    Example:

    -- make an mlp
    mlp1=nn.Sequential();
    mlp1:add(nn.Linear(100,10));
    
    -- make a copy that shares the weights and biases
    mlp2=mlp1:clone('weight','bias');
    
    -- we change the bias of the first mlp
    mlp1:get(1).bias[1]=99;
    
    -- and see that the second one's bias has also changed..
    print(mlp2:get(1).bias[1])
  • type(type[, tensorCache])

    Casts all parameters of the module to the given type, which must be one of the torch.Tensor types.

    If tensors are shared between several modules in a network, calling type on one module breaks that sharing.

    To preserve sharing across multiple modules and/or tensors, use nn.utils.recursiveType:

    Example:

    -- make an mlp
    mlp1=nn.Sequential();
    mlp1:add(nn.Linear(100,10));
    
    -- make a second mlp
    mlp2=nn.Sequential();
    mlp2:add(nn.Linear(100,10));
    
    -- the second mlp shares the bias of the first
    mlp2:share(mlp1,'bias');
    
    -- mlp1 and mlp2 will be converted to float, and will share bias
    -- note: tensors can be provided as inputs as well as modules
    nn.utils.recursiveType({mlp1, mlp2}, 'torch.FloatTensor')
  • float([tensorCache])

    Convenience method for module:type('torch.FloatTensor'[, tensorCache]).

  • double([tensorCache])

    Convenience method for module:type('torch.DoubleTensor'[, tensorCache]).

  • cuda([tensorCache])

    Convenience method for module:type('torch.CudaTensor'[, tensorCache]) (requires CUDA support via cutorch/cunn).
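
    A minimal sketch of casting a whole network with these convenience methods (the network itself is only illustrative):

    -- cast an entire network in one call:
    mlp = nn.Sequential()
    mlp:add(nn.Linear(10, 5))
    mlp:float()   -- same as mlp:type('torch.FloatTensor')
    -- mlp:cuda() works the same way once cutorch/cunn are loaded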

3. Containers

Complex neural networks can be built by composing modules with containers.

3.1 Container

An abstract class that declares the methods implemented by all containers.

  • add(module)

    Adds the given module to the container; modules are kept in the order in which they are added.

  • get(index)

    Returns the contained module at position index.

  • size()

    Returns the number of contained modules. (A short sketch of these three methods follows below.)
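
A minimal sketch, assuming a Sequential container (described next) and illustrative layers:

net = nn.Sequential()
net:add(nn.Linear(10, 5))   -- add modules in order
net:add(nn.Tanh())
print(net:size())           -- 2
print(net:get(1))           -- nn.Linear(10 -> 5)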

3.2 Sequential

Provides a means to plug layers together in a feed-forward, fully connected manner: each module is applied to the output of the previous one.

Example:

mlp = nn.Sequential()
mlp:add(nn.Linear(10, 25)) -- Linear module (10 inputs, 25 hidden units)
mlp:add(nn.Tanh())         -- apply hyperbolic tangent transfer function on each hidden units
mlp:add(nn.Linear(25, 1))  -- Linear module (25 inputs, 1 output)
> mlp
-- nn.Sequential {
--  [input -> (1) -> (2) -> (3) -> output]
--  (1): nn.Linear(10 -> 25)
--  (2): nn.Tanh
--  (3): nn.Linear(25 -> 1)
--}

> print(mlp:forward(torch.randn(10)))
-- -0.1815
-- [torch.Tensor of dimension 1]
  • remove([index])

    Removes the module at the given index. If index is not specified, the last layer is removed.

    Example:

    model = nn.Sequential()
    model:add(nn.Linear(10, 20))
    model:add(nn.Linear(20, 20))
    model:add(nn.Linear(20, 30))
    model:remove(2)
    > model
    -- nn.Sequential {
    --   [input -> (1) -> (2) -> output]
    --   (1): nn.Linear(10 -> 20)
    --   (2): nn.Linear(20 -> 30)
    -- }
  • insert(module, [index])

    Inserts the given module at position index. If index is not specified, this is equivalent to add(module).

    Example:

    model = nn.Sequential()
    model:add(nn.Linear(10, 20))
    model:add(nn.Linear(20, 30))
    model:insert(nn.Linear(20, 20), 2)
    > model
    -- nn.Sequential {
    --   [input -> (1) -> (2) -> (3) -> output]
    --   (1): nn.Linear(10 -> 20)
    --   (2): nn.Linear(20 -> 20)      -- The inserted layer
    --   (3): nn.Linear(20 -> 30)
    -- }

3.3 Parallel

Usage:

module = nn.Parallel(inputDimension, outputDimension)

Creates a container module that applies its i-th child module to the i-th slice of the input Tensor, where slices are taken along dimension inputDimension. The results of the child modules are then concatenated along dimension outputDimension.

Example 1:

mlp = nn.Parallel(2,1);   -- Parallel container will associate a module to each slice of dimension 2
                           -- (column space), and concatenate the outputs over the 1st dimension.

mlp:add(nn.Linear(10,3)); -- Linear module (input 10, output 3), applied on 1st slice of dimension 2
mlp:add(nn.Linear(10,2))  -- Linear module (input 10, output 2), applied on 2nd slice of dimension 2

                                  -- After going through the Linear module the outputs are
                                  -- concatenated along the unique dimension, to form 1D Tensor
> mlp:forward(torch.randn(10,2)) -- of size 5.
-0.5300
-1.1015
 0.7764
 0.2819
-0.6026
[torch.Tensor of dimension 5]

Example 2:

mlp = nn.Sequential();
c = nn.Parallel(1,2)     -- Parallel container will associate a module to each slice of dimension 1
                         -- (row space), and concatenate the outputs over the 2nd dimension.

for i=1,10 do            -- Add 10 Linear+Reshape modules in parallel (input = 3, output = 2x1)
 local t=nn.Sequential()
 t:add(nn.Linear(3,2))   -- Linear module (input = 3, output = 2)
 t:add(nn.Reshape(2,1))  -- Reshape 1D Tensor of size 2 to 2D Tensor of size 2x1
 c:add(t)
end

mlp:add(c)               -- Add the Parallel container in the Sequential container

pred = mlp:forward(torch.randn(10,3)) -- 2D Tensor of size 10x3 goes through the Sequential container
                                      -- which contains a Parallel container of 10 Linear+Reshape.
                                      -- Each Linear+Reshape module receives a slice of dimension 1
                                      -- which corresponds to a 1D Tensor of size 3.
                                      -- Eventually all the Linear+Reshape modules' outputs of size 2x1
                                      -- are concatenated along the 2nd dimension (column space)
                                      -- to form pred, a 2D Tensor of size 2x10.

> pred
-0.7987 -0.4677 -0.1602 -0.8060  1.1337 -0.4781  0.1990  0.2665 -0.1364  0.8109
-0.2135 -0.3815  0.3964 -0.4078  0.0516 -0.5029 -0.9783 -0.5826  0.4474  0.6092
[torch.DoubleTensor of size 2x10]


for i = 1, 10000 do     -- Train for a few iterations
 x = torch.randn(10,3);
 y = torch.ones(2,10);
 pred = mlp:forward(x)

 criterion = nn.MSECriterion()
 local err = criterion:forward(pred,y)
 local gradCriterion = criterion:backward(pred,y);
 mlp:zeroGradParameters();
 mlp:backward(x, gradCriterion);
 mlp:updateParameters(0.01);
 print(err)
end

3.4 Concat

Usage:

module = nn.Concat(dim)

Concatenates the outputs of its child modules along the given dimension dim: each child module receives the same input, and their outputs are joined together.

Example:

mlp = nn.Concat(1);
mlp:add(nn.Linear(5,3))
mlp:add(nn.Linear(5,7))

> print(mlp:forward(torch.randn(5)))
 0.7486
 0.1349
 0.7924
-0.0371
-0.4794
 0.3044
-0.0835
-0.7928
 0.7856
-0.1815
[torch.Tensor of dimension 10]

3.5 Weight Normalization

Usage:

module = nn.WeightNorm(module)

Applies weight normalization to the wrapped module: each weight w is re-parameterized as w = g * v / ||v||, decoupling the magnitude g from the direction v.
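
A minimal sketch of decorating a layer with weight normalization (the layer and input sizes are only illustrative):

-- wrap a Linear layer so its weight is re-parameterized as g * v / ||v||:
wn = nn.WeightNorm(nn.Linear(5, 3))
out = wn:forward(torch.randn(5))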

3.6 NaN

Usage:

dmodule = nn.NaN(module, [id])

The NaN module asserts that the output and gradInput of the decorated module contain no NaNs, which is useful for tracking down the source of NaN errors. The id defaults to an automatically incrementing counter: 1, 2, 3, ...

Example:

linear = nn.Linear(3,4)
mlp = nn.Sequential()
mlp:add(nn.NaN(nn.Identity()))
mlp:add(nn.NaN(linear))
mlp:add(nn.NaN(nn.Linear(4,2)))
print(mlp)
-- nn.Sequential {
--   [input -> (1) -> (2) -> (3) -> output]
--   (1): nn.NaN(1) @ nn.Identity
--   (2): nn.NaN(2) @ nn.Linear(3 -> 4)
--   (3): nn.NaN(3) @ nn.Linear(4 -> 2)
-- }
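
Continuing the example above, a minimal sketch of how the decorator flags the offending module when a NaN is injected (the exact error message may vary):

linear.weight[1][1] = 0/0              -- inject a NaN into the second module's weights
ok, err = pcall(function() return mlp:forward(torch.randn(3)) end)
print(ok, err)                         -- false, plus an error pointing at nn.NaN(2)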

4. Table Layers

4.1 ConcatTable

Usage:

module = nn.ConcatTable()

Feeds the same input to each member module.

                  +-----------+
             +----> {member1, |
+-------+    |    |           |
| input +----+---->  member2, |
+-------+    |    |           |
   or        +---->  member3} |
 {input}          +-----------+

Example:

mlp = nn.ConcatTable()
mlp:add(nn.Linear(5, 2))
mlp:add(nn.Linear(5, 3))

pred = mlp:forward(torch.randn(5))
for i, k in ipairs(pred) do print(i, k) end

4.2 ParallelTable

Usage:

module = nn.ParallelTable()

Applies the i-th member module to the i-th input (each module receives its own corresponding element of the input table).

+----------+         +-----------+
| {input1, +---------> {member1, |
|          |         |           |
|  input2, +--------->  member2, |
|          |         |           |
|  input3} +--------->  member3} |
+----------+         +-----------+

Example:

mlp = nn.ParallelTable()
mlp:add(nn.Linear(10, 2))
mlp:add(nn.Linear(5, 3))

x = torch.randn(10)
y = torch.rand(5)

pred = mlp:forward{x, y}
for i, k in pairs(pred) do print(i, k) end

4.3 MapTable

Usage:

module = nn.MapTable(m, share)

Applies the single given module to all elements of the input table, cloning it as needed so that there is one copy per input element; by default the clones share the original module's parameters.

+----------+         +-----------+
| {input1, +---------> {member,  |
|          |         |           |
|  input2, +--------->  clone,   |
|          |         |           |
|  input3} +--------->  clone}   |
+----------+         +-----------+

Example:

map = nn.MapTable()
map:add(nn.Linear(10, 3))

x1 = torch.rand(10)
x2 = torch.rand(10)
y = map:forward{x1, x2}

for i, k in pairs(y) do print(i, k) end

4.4 SplitTable

Usage:

module = nn.SplitTable(dimension, nInputDims)

Takes a Tensor as input and outputs a table of Tensors obtained by splitting it along the specified dimension.

    +----------+         +-----------+
    | input[1] +---------> {member1, |
  +----------+-+         |           |
  | input[2] +----------->  member2, |
+----------+-+           |           |
| input[3] +------------->  member3} |
+----------+             +-----------+

Example:

mlp = nn.SplitTable(2)
x = torch.randn(4, 3)
pred = mlp:forward(x)
for i, k in ipairs(pred) do print(i, k) end

4.5 JoinTable

Usage:

module = nn.JoinTable(dimension, nInputDims)

Takes a table of Tensors as input and concatenates them along the specified dimension to produce a single output Tensor.

+----------+             +-----------+
| {input1, +-------------> output[1] |
|          |           +-----------+-+
|  input2, +-----------> output[2] |
|          |         +-----------+-+
|  input3} +---------> output[3] |
+----------+         +-----------+

Example:

x = torch.randn(5, 1)
y = torch.randn(5, 1)
z = torch.randn(2, 1)

print(nn.JoinTable(1):forward{x, y})
print(nn.JoinTable(2):forward{x, y})
print(nn.JoinTable(1):forward{x, z})

5. Convolutional layers

6. Criterions

7. Reference

[1] - Neural Network Package
