Torch provides 4 methods for serializing/deserializing Lua/Torch objects.
torch.save(filename, object [, format, referenced])
Writes object to the file filename.
format can be ascii or binary (the default).
Example:
-- arbitrary object:
obj = {
mat = torch.randn(10,10),
name = '10',
test = {
entry = 1
}
}
-- save to disk:
torch.save('test.dat', obj)
[object] torch.load(filename [, format, referenced])
Reads an object from the file filename.
Example:
-- given serialized object from section above, reload:
obj = torch.load('test.dat')
print(obj)
-- will print:
-- {[mat] = DoubleTensor - size: 10x10
-- [name] = string : "10"
-- [test] = table - size: 0}
[str] torch.serialize(object)
Serializes object into a string.
Example:
-- arbitrary object:
obj = {
mat = torch.randn(10,10),
name = '10',
test = {
entry = 1
}
}
-- serialize:
str = torch.serialize(obj)
[object] torch.deserialize(str)
Deserializes an object from the given string.
Example:
-- given serialized object from section above, deserialize:
obj = torch.deserialize(str)
print(obj)
-- will print:
-- {[mat] = DoubleTensor - size: 10x10
-- [name] = string : "10"
-- [test] = table - size: 0}
Module is an abstract class that defines the fundamental methods needed to train a neural network; network structures are built through its member functions.
It maintains two state variables: output and gradInput.
[output] forward(input)
Computes the output corresponding to the given input. In general, input and output are Tensors.
Overriding forward is not recommended; instead, implement the updateOutput(input) function.
[gradInput] backward(input, gradOutput)
Performs a backpropagation step with respect to the given input. It must be called after a forward pass on the same input and is used when optimizing the network.
In general, input, gradOutput, and gradInput are Tensors.
Backpropagation involves two kinds of gradient computations, each with a corresponding function:
gradient with respect to the input - updateGradInput(input, gradOutput)
gradient with respect to the parameters - accGradParameters(input, gradOutput, scale); the parameter gradients are multiplied by scale and accumulated into gradParameters.
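A minimal sketch of a forward pass followed by a backward pass through a single module (the Linear layer size and the random tensors are illustrative assumptions):
m = nn.Linear(10, 5)
input = torch.randn(10)
output = m:forward(input)                  -- calls updateOutput(input) internally
gradOutput = torch.randn(5)                -- some gradient w.r.t. the output
gradInput = m:backward(input, gradOutput)  -- calls updateGradInput and accGradParameters
print(gradInput:size())                    -- same size as the input (10)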
zeroGradParameters()
If the module has parameters, this function zeros the accumulated gradient gradParameters.
updateParameters(learningRate)
Updates the module's parameters:
parameters = parameters - learningRate * gradients_wrt_parameters
accUpdateGradParameters(input, gradOutput, learningRate)
Accumulates the parameter gradients and updates the parameters in a single step.
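Putting these functions together, one manual training step might look like the following sketch (the layer sizes, target and learning rate are illustrative assumptions):
m = nn.Linear(10, 1)
criterion = nn.MSECriterion()
x = torch.randn(10)
y = torch.Tensor{1}
m:zeroGradParameters()                       -- clear the accumulated gradParameters
pred = m:forward(x)
err = criterion:forward(pred, y)
m:backward(x, criterion:backward(pred, y))   -- accumulate gradients w.r.t. the parameters
m:updateParameters(0.01)                     -- parameters = parameters - 0.01 * gradients
print(err)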
share(mlp,s1,s2,…,sn)
Modifies the parameters named s1, ..., sn of this module (if they exist) so that they are shared with (point to) the parameters with the same names in the given module mlp.
Example:
-- make an mlp
mlp1=nn.Sequential();
mlp1:add(nn.Linear(100,10));
-- make a second mlp
mlp2=nn.Sequential();
mlp2:add(nn.Linear(100,10));
-- the second mlp shares the bias of the first
mlp2:share(mlp1,'bias');
-- we change the bias of the first
mlp1:get(1).bias[1]=99;
-- and see that the second one's bias has also changed..
print(mlp2:get(1).bias[1])
clone(mlp,…)
Creates a deep copy of the module, including the current state of its parameters. If parameter names are passed as additional arguments (as in the example below), the clone shares those parameters with the original module.
Example:
-- make an mlp
mlp1=nn.Sequential();
mlp1:add(nn.Linear(100,10));
-- make a copy that shares the weights and biases
mlp2=mlp1:clone('weight','bias');
-- we change the bias of the first mlp
mlp1:get(1).bias[1]=99;
-- and see that the second one's bias has also changed..
print(mlp2:get(1).bias[1])
type(type[, tensorCache])
Casts all parameters of the module to the given type, which must be one of the types defined for torch.Tensor.
If tensors (or parameters) are shared between multiple modules in a network, calling type on each module separately breaks that sharing.
To preserve sharing between multiple modules and/or tensors, use nn.utils.recursiveType:
Example:
-- make an mlp
mlp1=nn.Sequential();
mlp1:add(nn.Linear(100,10));
-- make a second mlp
mlp2=nn.Sequential();
mlp2:add(nn.Linear(100,10));
-- the second mlp shares the bias of the first
mlp2:share(mlp1,'bias');
-- mlp1 and mlp2 will be converted to float, and will share bias
-- note: tensors can be provided as inputs as well as modules
nn.utils.recursiveType({mlp1, mlp2}, 'torch.FloatTensor')
float([tensorCache])
Convenience method for calling module:type('torch.FloatTensor'[, tensorCache]).
double([tensorCache])
Convenience method for calling module:type('torch.DoubleTensor'[, tensorCache]).
cuda([tensorCache])
Convenience method for calling module:type('torch.CudaTensor'[, tensorCache]).
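For example, a sketch that converts a model and its input to float precision (the layer size is an illustrative assumption; cuda() additionally requires the cutorch/cunn packages):
mlp = nn.Sequential()
mlp:add(nn.Linear(10, 5))
mlp:float()                                  -- same as mlp:type('torch.FloatTensor')
print(mlp:forward(torch.randn(10):float()))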
Complex neural networks can be built with containers.
Container is an abstract class that declares the methods shared by all containers.
add(module)
Adds the given module to the container, in order.
get(index)
Returns the contained module at the given index.
size()
Returns the number of contained modules.
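A brief sketch of these methods on nn.Sequential, one concrete container (the layer sizes are illustrative assumptions):
container = nn.Sequential()
container:add(nn.Linear(10, 25))   -- modules are added in order
container:add(nn.Tanh())
print(container:size())            -- 2
print(container:get(1))            -- nn.Linear(10 -> 25)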
Sequential plugs network layers together in a feed-forward, fully connected manner.
Example:
mlp = nn.Sequential()
mlp:add(nn.Linear(10, 25)) -- Linear module (10 inputs, 25 hidden units)
mlp:add(nn.Tanh()) -- apply hyperbolic tangent transfer function on each hidden unit
mlp:add(nn.Linear(25, 1)) -- Linear module (25 inputs, 1 output)
> mlp
-- nn.Sequential {
-- [input -> (1) -> (2) -> (3) -> output]
-- (1): nn.Linear(10 -> 25)
-- (2): nn.Tanh
-- (3): nn.Linear(25 -> 1)
--}
> print(mlp:forward(torch.randn(10)))
-- -0.1815
-- [torch.Tensor of dimension 1]
remove([index])
Removes the module at the given index. If index is not specified, the last layer is removed.
Example:
model = nn.Sequential()
model:add(nn.Linear(10, 20))
model:add(nn.Linear(20, 20))
model:add(nn.Linear(20, 30))
model:remove(2)
> model
-- nn.Sequential {
-- [input -> (1) -> (2) -> output]
-- (1): nn.Linear(10 -> 20)
-- (2): nn.Linear(20 -> 30)
-- }
insert(module, [index])
Inserts the given module at the given index. If index is not specified, this is equivalent to add(module).
Example:
model = nn.Sequential()
model:add(nn.Linear(10, 20))
model:add(nn.Linear(20, 30))
model:insert(nn.Linear(20, 20), 2)
> model
-- nn.Sequential {
-- [input -> (1) -> (2) -> (3) -> output]
-- (1): nn.Linear(10 -> 20)
-- (2): nn.Linear(20 -> 20) -- The inserted layer
-- (3): nn.Linear(20 -> 30)
-- }
Usage:
module = Parallel(inputDimension,outputDimension)
Creates a container module that applies its ith child module to the ith slice of the input Tensor, where the slices are taken along dimension inputDimension. The outputs of the child modules are then concatenated along dimension outputDimension.
Example 1:
mlp = nn.Parallel(2,1); -- Parallel container will associate a module to each slice of dimension 2
-- (column space), and concatenate the outputs over the 1st dimension.
mlp:add(nn.Linear(10,3)); -- Linear module (input 10, output 3), applied on 1st slice of dimension 2
mlp:add(nn.Linear(10,2)) -- Linear module (input 10, output 2), applied on 2nd slice of dimension 2
-- After going through the Linear module the outputs are
-- concatenated along the unique dimension, to form a 1D Tensor
> mlp:forward(torch.randn(10,2)) -- of size 5.
-0.5300
-1.1015
0.7764
0.2819
-0.6026
[torch.Tensor of dimension 5]
Example 2:
mlp = nn.Sequential();
c = nn.Parallel(1,2) -- Parallel container will associate a module to each slice of dimension 1
-- (row space), and concatenate the outputs over the 2nd dimension.
for i=1,10 do -- Add 10 Linear+Reshape modules in parallel (input = 3, output = 2x1)
local t=nn.Sequential()
t:add(nn.Linear(3,2)) -- Linear module (input = 3, output = 2)
t:add(nn.Reshape(2,1)) -- Reshape 1D Tensor of size 2 to 2D Tensor of size 2x1
c:add(t)
end
mlp:add(c) -- Add the Parallel container in the Sequential container
pred = mlp:forward(torch.randn(10,3)) -- 2D Tensor of size 10x3 goes through the Sequential container
-- which contains a Parallel container of 10 Linear+Reshape.
-- Each Linear+Reshape module receives a slice of dimension 1
-- which corresponds to a 1D Tensor of size 3.
-- Eventually all the Linear+Reshape modules' outputs of size 2x1
-- are concatenated along the 2nd dimension (column space)
-- to form pred, a 2D Tensor of size 2x10.
> pred
-0.7987 -0.4677 -0.1602 -0.8060 1.1337 -0.4781 0.1990 0.2665 -0.1364 0.8109
-0.2135 -0.3815 0.3964 -0.4078 0.0516 -0.5029 -0.9783 -0.5826 0.4474 0.6092
[torch.DoubleTensor of size 2x10]
for i = 1, 10000 do -- Train for a few iterations
x = torch.randn(10,3);
y = torch.ones(2,10);
pred = mlp:forward(x)
criterion = nn.MSECriterion()
local err = criterion:forward(pred,y)
local gradCriterion = criterion:backward(pred,y);
mlp:zeroGradParameters();
mlp:backward(x, gradCriterion);
mlp:updateParameters(0.01);
print(err)
end
Usage:
module = nn.Concat(dim)
Concatenates, along the given dimension dim, the outputs of the contained modules: each module is applied to the same input, and their outputs are joined together.
Example:
mlp = nn.Concat(1);
mlp:add(nn.Linear(5,3))
mlp:add(nn.Linear(5,7))
> print(mlp:forward(torch.randn(5)))
0.7486
0.1349
0.7924
-0.0371
-0.4794
0.3044
-0.0835
-0.7928
0.7856
-0.1815
[torch.Tensor of dimension 10]
Usage:
module = nn.WeightNorm(module)
Decorates the given module with weight normalization, reparameterizing its weight as
w = g * v / ||v||
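A brief usage sketch, assuming the nn.WeightNorm decorator shipped with recent versions of the nn package (the layer sizes are illustrative assumptions):
wn = nn.WeightNorm(nn.Linear(5, 3))   -- Linear layer with its weight reparameterized as g * v / ||v||
print(wn:forward(torch.randn(5)))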
Usage:
dmodule = nn.NaN(module, [id])
The NaN module asserts that the output and gradInput of the decorated module contain no NaNs. This is useful for tracking down the source of NaN errors. The id defaults to automatically incremented values 1, 2, 3, ... .
Example:
linear = nn.Linear(3,4)
mlp = nn.Sequential()
mlp:add(nn.NaN(nn.Identity()))
mlp:add(nn.NaN(linear))
mlp:add(nn.NaN(nn.Linear(4,2)))
print(mlp)
-- nn.Sequential {
-- [input -> (1) -> (2) -> (3) -> output]
-- (1): nn.NaN(1) @ nn.Identity
-- (2): nn.NaN(2) @ nn.Linear(3 -> 4)
-- (3): nn.NaN(3) @ nn.Linear(4 -> 2)
-- }
Usage:
module = nn.ConcatTable()
Applies the same input to every member module.
                  +-----------+
             +----> {member1, |
+-------+    |    |           |
| input +----+----> member2,  |
+-------+    |    |           |
 or          +----> member3}  |
{input}           +-----------+
Example:
mlp = nn.ConcatTable()
mlp:add(nn.Linear(5, 2))
mlp:add(nn.Linear(5, 3))
pred = mlp:forward(torch.randn(5))
for i, k in ipairs(pred) do print(i, k) end
Usage:
module = nn.ParallelTable()
Applies the ith member module to the ith input (module i receives input i).
+----------+         +-----------+
| {input1, +---------> {member1, |
|          |         |           |
|  input2, +---------> member2,  |
|          |         |           |
|  input3} +---------> member3}  |
+----------+         +-----------+
Example:
mlp = nn.ParallelTable()
mlp:add(nn.Linear(10, 2))
mlp:add(nn.Linear(5, 3))
x = torch.randn(10)
y = torch.rand(5)
pred = mlp:forward{x, y}
for i, k in pairs(pred) do print(i, k) end
Usage:
module = nn.MapTable(m, share)
Applies a single module to every element of the input table, cloning the module as needed to cover all inputs.
+----------+         +-----------+
| {input1, +---------> {member,  |
|          |         |           |
|  input2, +---------> clone,    |
|          |         |           |
|  input3} +---------> clone}    |
+----------+         +-----------+
Example:
map = nn.MapTable()
map:add(nn.Linear(10, 3))
x1 = torch.rand(10)
x2 = torch.rand(10)
y = map:forward{x1, x2}
for i, k in pairs(y) do print(i, k) end
Usage:
module = SplitTable(dimension, nInputDims)
Takes a Tensor as input and outputs a table of Tensors obtained by splitting the input along the given dimension.
    +----------+         +-----------+
    | input[1] +---------> {member1, |
  +----------+-+         |           |
  | input[2] +-----------> member2,  |
+----------+-+           |           |
| input[3] +-------------> member3}  |
+----------+             +-----------+
Example:
mlp = nn.SplitTable(2)
x = torch.randn(4, 3)
pred = mlp:forward(x)
for i, k in ipairs(pred) do print(i, k) end
Usage:
module = JoinTable(dimension, nInputDims)
Takes a table of Tensors as input and outputs a Tensor obtained by joining them along the given dimension.
+----------+             +-----------+
| {input1, +-------------> output[1] |
|          |           +-----------+-+
|  input2, +-----------> output[2] |
|          |         +-----------+-+
|  input3} +---------> output[3] |
+----------+         +-----------+
Example:
x = torch.randn(5, 1)
y = torch.randn(5, 1)
z = torch.randn(2, 1)
print(nn.JoinTable(1):forward{x, y})
print(nn.JoinTable(2):forward{x, y})
print(nn.JoinTable(1):forward{x, z})