Torch7 Learning (6): Using the Neural Network Package (4), Training with optim

Torch7 Learning (1): Tensors
Torch7 Learning (2): Syntax Comparison between Torch and Matlab
Torch7 Learning (3): Using the Neural Network Package (1)
Torch7 Learning (4): Using the Neural Network Package (2)
Torch7 Learning (5): Using the Neural Network Package (3)
Torch7 Learning (6): Using the Neural Network Package (4), Training with optim
Torch7 Learning (7): Training via Custom Overridden Functions, as Seen in the neural-style Code

Overview

This post is the last one in the series; it focuses on how to use the optim package for "automatic" training, i.e. letting optim drive the parameter updates instead of writing the update step by hand.

-- standard training code
-- Here let's train an XOR net.

require 'torch'
require 'cunn'
require 'cutorch'
require 'nn'
require 'optim'

-- generate the dataset
local minibatches = 128
local trainset = torch.randn(minibatches,2)
local trainlabels = torch.zeros(minibatches)


for i = 1,minibatches do
    local inputs = torch.randn(2)
    if inputs[1]*inputs[2] > 0 then
        trainlabels[i] = -1
    else
        trainlabels[i] = 1
    end
  -- Every example should be copied into trainset like this,
  -- so that the first dimension is regarded as the minibatch size.
  trainset[i]:copy(inputs)
end

-- define Network
local inputUs = 2; local HUs = 10; local outputUs = 1
net = nn.Sequential()
net:add(nn.Linear(inputUs,HUs))
net:add(nn.Tanh())
net:add(nn.Linear(HUs,outputUs))

-- define criterion
criterion = nn.MSECriterion()

-- transfer the relevant things to CUDA: the data, the net and the criterion
-- The network here is tiny, so we keep everything on the CPU and leave these lines commented out.
-- If you uncomment them, you must also convert the net itself, otherwise the CPU Linear layers cannot handle CudaTensor inputs.
--trainset = trainset:cuda()
--trainlabels = trainlabels:cuda()
--criterion = criterion:cuda()
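-- A minimal sketch of the GPU path (an addition, assuming a CUDA-capable GPU; cunn/cutorch are already required above):
-- uncomment this line together with the three above to move the whole pipeline to the GPU.
--net = net:cuda()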



-- standard training
    -- first get params and gradParams 
local params, gradParams = net:getParameters()
-- optimState 
local optimState = {learningRate = 0.001}

-- For-loop to call feval
for epoch = 1,1000 do

    local function feval(params)
        -- this function will be called many times by optim, so make sure gradParams is reset to zero on every call
        gradParams:zero()

        local predict = net:forward(trainset)
        -- note that criterion:forward and criterion:backward both take the same arguments (predict, target)
        -- criterion:backward returns the gradient of the loss w.r.t. predict, which is fed into net:backward as the starting gradOutput of backpropagation.
        local loss = criterion:forward(predict, trainlabels)
        local dloss_dpredict = criterion:backward(predict,trainlabels)

        -- Note: net:backward does not return gradParams (the gradWeights and gradBias).
        -- Because params and gradParams were flattened by getParameters, calling net:backward refreshes gradParams in place.
        -- If you want to inspect the values, look at the gradParams tensor returned by getParameters.
        local gradInput = net:backward(trainset, dloss_dpredict)

        return loss, gradParams
    end

    optim.sgd(feval,params,optimState)

end
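-- A sketch (an addition, not in the original code) of how one could monitor the loss:
-- optim.sgd returns the updated parameters and a table containing the loss computed by feval,
-- so inside the loop above you could write, e.g.:
--     local _, fs = optim.sgd(feval, params, optimState)
--     if epoch % 100 == 0 then print(string.format('epoch %d, loss = %f', epoch, fs[1])) end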

-- Test network

local x = torch.Tensor({
    {0.5,0.5},
    {0.5,-0.5},
    {-0.5,-0.5},
    {-0.5,0.5}
})
-- If everything had been converted to CUDA, you would use print(net:forward(x:cuda()))
print(net:forward(x))
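-- Sketch (an addition, not from the original post): the net outputs real values,
-- so to read them as the -1/+1 labels used above you can threshold at zero with torch.sign:
print(torch.sign(net:forward(x)))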

Summary

"Automatic" training just means repeatedly calling something like optim.sgd(feval, params, optimState) inside a loop.
The prototype of sgd is:

x*, {f}, ... = optim.method(opfunc, x[, config][, state])

That is, x is the variable to be optimized and opfunc is a handle to the objective function. What we want to optimize here are the network parameters, so we pass in the params obtained earlier from getParameters. config is a table of settings for the gradient-descent optimizer, including
learningRate, learningRateDecay, weightDecay, momentum.
state is usually not needed.
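
For illustration, a config table with these fields might look like the sketch below (the values are placeholders, not tuned for this example):

local optimState = {
    learningRate = 0.001,
    learningRateDecay = 1e-7,
    weightDecay = 0,
    momentum = 0.9
}
-- then, inside the training loop:
optim.sgd(feval, params, optimState)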
