Andrew Ng Machine Learning, Exercise 2: the optimset and fminunc Functions

Exercise 2 uses two functions: optimset and fminunc.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Set the GradObj option to 'on', which tells fminunc that our function
% returns both the cost and the gradient.
% This allows fminunc to use the gradient when minimizing the function.
% Set the MaxIter option to 400, so that fminunc will run for at most 400
% steps before it terminates.

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost 
[theta, cost] = ...
	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

1. Inspecting the optimset function with help

>> help optimset
 optimset Create/alter optimization OPTIONS structure.
    OPTIONS = optimset('PARAM1',VALUE1,'PARAM2',VALUE2,...) creates an
    optimization options structure OPTIONS in which the named parameters have
    the specified values.  Any unspecified parameters are set to [] (parameters
    with value [] indicate to use the default value for that parameter when
    OPTIONS is passed to the optimization function). It is sufficient to type
    only the leading characters that uniquely identify the parameter.  Case is
    ignored for parameter names.
    NOTE: For values that are strings, the complete string is required.
 
    OPTIONS = optimset(OLDOPTS,'PARAM1',VALUE1,...) creates a copy of OLDOPTS
    with the named parameters altered with the specified values.
 
    OPTIONS = optimset(OLDOPTS,NEWOPTS) combines an existing options structure
    OLDOPTS with a new options structure NEWOPTS.  Any parameters in NEWOPTS
    with non-empty values overwrite the corresponding old parameters in
    OLDOPTS.
 
    optimset with no input arguments and no output arguments displays all
    parameter names and their possible values, with defaults shown in {}
    when the default is the same for all functions that use that parameter. 
    Use optimset(OPTIMFUNCTION) to see parameters for a specific function.
 
    OPTIONS = optimset (with no input arguments) creates an options structure
    OPTIONS where all the fields are set to [].
 
    OPTIONS = optimset(OPTIMFUNCTION) creates an options structure with all
    the parameter names and default values relevant to the optimization
    function named in OPTIMFUNCTION. For example,
            optimset('fminbnd')
    or
            optimset(@fminbnd)
    returns an options structure containing all the parameter names and
    default values relevant to the function 'fminbnd'.
 
 optimset PARAMETERS for MATLAB
 Display - Level of display [ off | iter | notify | final ]
 MaxFunEvals - Maximum number of function evaluations allowed
                      [ positive integer ]
 MaxIter - Maximum number of iterations allowed [ positive scalar ]
 TolFun - Termination tolerance on the function value [ positive scalar ]
 TolX - Termination tolerance on X [ positive scalar ]
 FunValCheck - Check for invalid values, such as NaN or complex, from 
               user-supplied functions [ {off} | on ]
 OutputFcn - Name(s) of output function [ {[]} | function ] 
           All output functions are called by the solver after each
           iteration.
 PlotFcns - Name(s) of plot function [ {[]} | function ]
           Function(s) used to plot various quantities in every iteration
 
  Note to Optimization Toolbox users:
  To see the parameters for a specific function, check the documentation page 
  for that function. For instance, enter
    doc fmincon
  to open the reference page for fmincon.
 
  You can also see the options in the Optimization Tool. Enter
    optimtool
           
    Examples:
      To create an options structure with the default parameters for FZERO
        options = optimset('fzero');
      To create an options structure with TolFun equal to 1e-3
        options = optimset('TolFun',1e-3);
      To change the Display value of options to 'iter'
        options = optimset(options,'Display','iter');
 
    See also optimget, fzero, fminbnd, fminsearch, lsqnonneg.

    Reference page for optimset
    Other functions named optimset

The optimset function creates or modifies an optimization OPTIONS structure.
OPTIONS = optimset('PARAM1',VALUE1,'PARAM2',VALUE2,...) creates an optimization options structure OPTIONS in which the named parameters are set to the specified values; any unspecified parameter is set to [], which means its default value is used when OPTIONS is passed to an optimization function.
OPTIONS = optimset (with no input arguments) creates an options structure OPTIONS with all fields set to [].
OPTIONS = optimset(OPTIMFUNCTION) creates an options structure for the optimization function OPTIMFUNCTION, with every parameter set to that function's default value.
For example, optimset('fminbnd') or optimset(@fminbnd) returns an options structure containing all the parameter names and default values relevant to fminbnd.
Based on the above, we can run optimset('fminunc') to see the parameters of fminunc.
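As a quick illustration of the first two syntaxes (a sketch; the parameter names come from the help text above):

```matlab
base = optimset('Display', 'iter');     % structure with one parameter set
opts = optimset(base, 'MaxIter', 200);  % copy of base with MaxIter altered
```

Here opts keeps Display from base and overrides only MaxIter, matching the OPTIONS = optimset(OLDOPTS,'PARAM1',VALUE1,...) form.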

>> optimset('fminunc')

ans = 

                   Display: 'final'
               MaxFunEvals: '100*numberofvariables'
                   MaxIter: 400
                    TolFun: 1.0000e-06
                      TolX: 1.0000e-06
               FunValCheck: 'off'
                 OutputFcn: []
                  PlotFcns: []
           ActiveConstrTol: []
                 Algorithm: []
    AlwaysHonorConstraints: []
           DerivativeCheck: 'off'
               Diagnostics: 'off'
             DiffMaxChange: Inf
             DiffMinChange: 0
            FinDiffRelStep: []
               FinDiffType: 'forward'
         GoalsExactAchieve: []
                GradConstr: []
                   GradObj: 'off'
                   HessFcn: []
                   Hessian: 'off'
                  HessMult: []
               HessPattern: 'sparse(ones(numberofvariables))'
                HessUpdate: 'bfgs'
           InitialHessType: 'scaled-identity'
         InitialHessMatrix: []
          InitBarrierParam: []
     InitTrustRegionRadius: []
                  Jacobian: []
                 JacobMult: []
              JacobPattern: []
                LargeScale: 'on'
                  MaxNodes: []
                MaxPCGIter: 'max(1,floor(numberofvariables/2))'
             MaxProjCGIter: []
                MaxSQPIter: []
                   MaxTime: []
             MeritFunction: []
                 MinAbsMax: []
        NoStopIfFlatInfeas: []
            ObjectiveLimit: -1.0000e+20
      PhaseOneTotalScaling: []
            Preconditioner: []
          PrecondBandWidth: 0
            RelLineSrchBnd: []
    RelLineSrchBndDuration: []
              ScaleProblem: []
                   Simplex: []
       SubproblemAlgorithm: []
                    TolCon: []
                 TolConSQP: []
                TolGradCon: []
                    TolPCG: 0.1000
                 TolProjCG: []
              TolProjCGAbs: []
                  TypicalX: 'ones(numberofvariables,1)'
               UseParallel: 0 

So options = optimset('GradObj', 'on', 'MaxIter', 400) creates an options structure named options that sets 'GradObj' to 'on' (telling the solver to use the user-supplied gradient function) and 'MaxIter' to 400 (at most 400 iterations).
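For GradObj to take effect, the objective function must return the gradient as its second output. A minimal sketch of such a function, assuming the logistic-regression setting of Exercise 2 (the exact costFunction body is not shown in this post, so treat the implementation below as an assumption):

```matlab
function [J, grad] = costFunction(theta, X, y)
% Cost and gradient for logistic regression, assuming m training examples.
m = length(y);
h = 1 ./ (1 + exp(-X * theta));                    % sigmoid hypothesis
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m;    % cross-entropy cost
grad = (X' * (h - y)) / m;                         % gradient, same size as theta
end
```

With 'GradObj' set to 'on', fminunc calls this function with two output arguments and uses grad instead of estimating the gradient numerically.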
2. The statement [theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options)
Use help fminunc to see how fminunc is used.

>> help fminunc
 fminunc finds a local minimum of a function of several variables.
    X = fminunc(FUN,X0) starts at X0 and attempts to find a local minimizer
    X of the function FUN. FUN accepts input X and returns a scalar
    function value F evaluated at X. X0 can be a scalar, vector or matrix. 
 
    X = fminunc(FUN,X0,OPTIONS) minimizes with the default optimization
    parameters replaced by values in OPTIONS, an argument created with the
    OPTIMOPTIONS function.  See OPTIMOPTIONS for details. Use the
    SpecifyObjectiveGradient option to specify that FUN also returns a
    second output argument G that is the partial derivatives of the
    function df/dX, at the point X. Use the HessianFcn option to specify
    that FUN also returns a third output argument H that is the 2nd partial
    derivatives of the function (the Hessian) at the point X. The Hessian
    is only used by the trust-region algorithm.
 
    X = fminunc(PROBLEM) finds the minimum for PROBLEM. PROBLEM is a
    structure with the function FUN in PROBLEM.objective, the start point
    in PROBLEM.x0, the options structure in PROBLEM.options, and solver
    name 'fminunc' in PROBLEM.solver. Use this syntax to solve at the 
    command line a problem exported from OPTIMTOOL. 
 
    [X,FVAL] = fminunc(FUN,X0,...) returns the value of the objective 
    function FUN at the solution X.
 
    [X,FVAL,EXITFLAG] = fminunc(FUN,X0,...) returns an EXITFLAG that
    describes the exit condition. Possible values of EXITFLAG and the
    corresponding exit conditions are listed below. See the documentation
    for a complete description.
 
      1  Magnitude of gradient small enough. 
      2  Change in X too small.
      3  Change in objective function too small.
      5  Cannot decrease function along search direction.
      0  Too many function evaluations or iterations.
     -1  Stopped by output/plot function.
     -3  Problem seems unbounded. 
    
    [X,FVAL,EXITFLAG,OUTPUT] = fminunc(FUN,X0,...) returns a structure 
    OUTPUT with the number of iterations taken in OUTPUT.iterations, the 
    number of function evaluations in OUTPUT.funcCount, the algorithm used 
    in OUTPUT.algorithm, the number of CG iterations (if used) in
    OUTPUT.cgiterations, the first-order optimality (if used) in
    OUTPUT.firstorderopt, and the exit message in OUTPUT.message.
 
    [X,FVAL,EXITFLAG,OUTPUT,GRAD] = fminunc(FUN,X0,...) returns the value 
    of the gradient of FUN at the solution X.
 
    [X,FVAL,EXITFLAG,OUTPUT,GRAD,HESSIAN] = fminunc(FUN,X0,...) returns the 
    value of the Hessian of the objective function FUN at the solution X.
 
    Examples
      FUN can be specified using @:
         X = fminunc(@myfun,2)
 
    where myfun is a MATLAB function such as:
 
        function F = myfun(x)
        F = sin(x) + 3;
 
      To minimize this function with the gradient provided, modify
      the function myfun so the gradient is the second output argument:
         function [f,g] = myfun(x)
          f = sin(x) + 3;
          g = cos(x);
      and indicate the gradient value is available by creating options with
      OPTIONS.SpecifyObjectiveGradient set to true (using OPTIMOPTIONS):
         options = optimoptions('fminunc','SpecifyObjectiveGradient',true);
         x = fminunc(@myfun,4,options);
 
      FUN can also be an anonymous function:
         x = fminunc(@(x) 5*x(1)^2 + x(2)^2,[5;1])
 
    If FUN is parameterized, you can use anonymous functions to capture the
    problem-dependent parameters. Suppose you want to minimize the 
    objective given in the function myfun, which is parameterized by its 
    second argument c. Here myfun is a MATLAB file function such as
 
      function [f,g] = myfun(x,c)
 
      f = c*x(1)^2 + 2*x(1)*x(2) + x(2)^2; % function
      g = [2*c*x(1) + 2*x(2)               % gradient
           2*x(1) + 2*x(2)];
 
    To optimize for a specific value of c, first assign the value to c. 
    Then create a one-argument anonymous function that captures that value 
    of c and calls myfun with two arguments. Finally, pass this anonymous 
    function to fminunc:
 
      c = 3;                              % define parameter first
      options = optimoptions('fminunc','SpecifyObjectiveGradient',true); % indicate gradient is provided 
      x = fminunc(@(x) myfun(x,c),[1;1],options)
 
    See also optimoptions, fminsearch, fminbnd, fmincon, @, inline.

    Reference page for fminunc

>> 

fminunc finds a local minimum of a function of several variables.
X = fminunc(FUN,X0) starts at X0 and attempts to find a local minimizer X of the function FUN. FUN accepts input X and returns a scalar value evaluated at X; X0 can be a scalar, vector, or matrix.
X = fminunc(FUN,X0,OPTIONS) minimizes with the default optimization parameters replaced by the values in OPTIONS.
For example, FUN can be specified as a function handle:
X = fminunc(@myfun,2)
function F = myfun(x)
F = sin(x) + 3;
To use a user-supplied gradient, modify myfun so that its second output argument is the gradient, and create an options structure that tells the solver the gradient is available:
function [f,g] = myfun(x)
f = sin(x) + 3;
g = cos(x);
options = optimoptions('fminunc','SpecifyObjectiveGradient',true);
x = fminunc(@myfun,4,options);

Putting this together, the statement [theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options) means:
(1) @(t)(costFunction(t, X, y)) is an anonymous function. The original costFunction takes three inputs (theta, X, y) and returns two outputs (the cost J and the gradient). Wrapping it in the anonymous function reduces the input to a single argument t, with X and y captured as fixed values (both are defined earlier). This matches what fminunc expects as its first argument: a function of a single input whose first output is the value to minimize and whose second output is the user-supplied gradient;
(2) initial_theta is the starting point of the search for a local minimum;
(3) options is the configuration for fminunc: at most 400 iterations, with the user-supplied gradient enabled ('GradObj' set to 'on');
(4) Per the documentation, [X,FVAL] = fminunc(FUN,X0,...) returns the value of the objective function FUN at the solution X.
So [theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options) returns a local minimizer theta of costFunction, along with cost, the value of costFunction at theta.
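The whole call can be sketched end to end (a minimal sketch; the toy data below is invented for illustration, and costFunction is assumed to return both the cost and the gradient as in the exercise):

```matlab
% Toy data (invented for illustration): 4 examples, intercept column included.
X = [ones(4, 1), [1; 2; 3; 4]];
y = [0; 0; 1; 1];
initial_theta = zeros(size(X, 2), 1);

% Use the user-supplied gradient and cap the iterations at 400.
options = optimset('GradObj', 'on', 'MaxIter', 400);

% The anonymous function fixes X and y, leaving t as the only free input.
[theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
```

After the call, theta holds the fitted parameters and cost holds the cost at theta.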
