AI Tutorial - Mathematical Foundations 1.7 - Optimization Methods 4-7: Core of Step Two of the Optimization Approach, Constraints, KKT

Core of Step Two of the Optimization Approach

Determination of a search direction $d_k$:

  • Basis of the methods: Taylor series of $f(x)$ in a small vicinity of the point $x_k$:
    $$f(x_k+\delta)=f(x_k)+\nabla^T f(x_k)\,\delta+\frac{1}{2}\delta^T \nabla^2 f(x_k)\,\delta+O(\|\delta\|^3)$$
  • Steepest descent method: $d_k=-\nabla f(x_k)$, i.e., $d_k=-g(x_k)$
  • Newton method: $d_k=-H^{-1}(x_k)\,g(x_k)$
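The two search directions above can be sketched numerically. The following is a minimal example on a hypothetical quadratic $f(x)=\frac{1}{2}x^TAx-b^Tx$ (so $g(x)=Ax-b$ and $H=A$); the matrix `A`, vector `b`, and starting point `x_k` are illustrative choices, not from the text:

```python
import numpy as np

# Hypothetical quadratic test function f(x) = 1/2 x^T A x - b^T x,
# with gradient g(x) = A x - b and constant Hessian H = A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

def grad(x):
    return A @ x - b

def hess(x):
    return A

x_k = np.array([2.0, 2.0])

# Steepest descent direction: d_k = -g(x_k)
d_sd = -grad(x_k)

# Newton direction: d_k = -H^{-1}(x_k) g(x_k); for a quadratic, a single
# full Newton step lands exactly on the stationary point x* = A^{-1} b.
d_newton = -np.linalg.solve(hess(x_k), grad(x_k))

x_star = x_k + d_newton
print(d_sd)                          # steepest descent direction at x_k
print(np.allclose(A @ x_star, b))    # Newton step reaches A x* = b
```

Note the design difference: steepest descent only needs the gradient, while the Newton direction requires solving a linear system with the Hessian at every step.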

Constraints, KKT

Constrained Optimization: The Karush-Kuhn-Tucker (KKT) Conditions

In many cases an unconstrained problem has no meaningful solution; only with constraints does the minimization become well-posed.

  • The constrained optimization problem:
    minimize $f(x)$
    subject to: $a_i(x)=0$ for $i = 1, 2, \dots, p$
    and $c_j(x)\geq 0$ for $j = 1, 2, \dots, q$

  • Feasible region: $\{x : a_i(x)=0 \text{ for } i = 1, 2, \dots, p, \text{ and } c_j(x)\geq 0 \text{ for } j = 1, 2, \dots, q\}$

  • A point $x$ is said to be feasible if it is in the feasible region (inside or on the boundary). An inequality constraint $c_j(x) \geq 0$ is said to be active at a feasible point $x$ if $c_j(x) = 0$, and inactive at a feasible point $x$ if $c_j(x) > 0$.

    For instance, minimizing an unbounded one-dimensional function is meaningless, but restricting the search to an interval $[a, b]$ makes the minimum well-defined.
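The feasible/active/inactive classification above can be checked mechanically. The sketch below uses three hypothetical inequality constraints and a hypothetical tolerance `TOL` for deciding when a value counts as zero (both are illustrative assumptions, not part of the definition):

```python
import numpy as np

TOL = 1e-9  # numerical tolerance for "c_j(x) = 0" (an assumption)

# Hypothetical inequality constraints c_j(x) >= 0
constraints = [
    lambda x: x[0],                # c_1(x) = x_1
    lambda x: x[1],                # c_2(x) = x_2
    lambda x: -x[0] - x[1] + 1.0,  # c_3(x) = -x_1 - x_2 + 1
]

def classify(x):
    """Label each constraint as violated, active, or inactive at x."""
    status = []
    for j, c in enumerate(constraints, start=1):
        v = c(x)
        if v < -TOL:
            status.append((j, "violated"))   # x is infeasible for c_j
        elif abs(v) <= TOL:
            status.append((j, "active"))     # c_j(x) = 0: on the boundary
        else:
            status.append((j, "inactive"))   # c_j(x) > 0: strictly inside
    return status

print(classify(np.array([1.0, 0.0])))  # a boundary point of this region
```

At the point $(1, 0)$ the first constraint holds strictly while the other two hold with equality, so the classifier reports one inactive and two active constraints.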

  • KKT conditions (first-order necessary conditions for a point $x^*$ to be a minimum point):
    If $x^*$ is a local minimizer of the constrained problem, then
    (a) $a_i(x^*)=0$ for $i = 1, 2, \dots, p$
    (b) $c_j(x^*)\geq 0$ for $j = 1, 2, \dots, q$
    (c) there exist Lagrange multipliers $\lambda_i^*$ for $1\leq i\leq p$ and $\mu_j^*$ for $1\leq j\leq q$ such that
    $$\color{red}\nabla f(x^*)=\sum_{i=1}^p\lambda_i^*\nabla a_i(x^*)+\sum_{j=1}^q\mu_j^*\nabla c_j(x^*)$$
    (d) $\mu_j^* c_j(x^*)=0$ for $j = 1, 2, \dots, q$
    (e) $\mu_j^*\geq 0$ for $j = 1, 2, \dots, q$
    Ex:
    (Figure 1)
    Minimizing $f(x_1,x_2)=-x_1+x_2$ alone is meaningless: the function is unbounded below.
    Adding the following three constraints makes the problem meaningful:

  1. $c_1(x)= x_1\geq 0$
  2. $c_2(x)= x_2\geq 0$ (active)
  3. $c_3(x)= -x_1-x_2+1\geq 0$ (active), whose boundary is the line $x_2 = -x_1+1$

$$x^*=\begin{bmatrix} 1\\ 0 \end{bmatrix}$$

$$\nabla f=\begin{bmatrix} -1\\ 1 \end{bmatrix}=0\begin{bmatrix} 1\\ 0 \end{bmatrix}+2\begin{bmatrix} 0\\ 1 \end{bmatrix}+1\begin{bmatrix} -1\\ -1 \end{bmatrix}$$
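The multiplier check for this example can be verified numerically. The sketch below hard-codes the gradients of $f$, $c_1$, $c_2$, $c_3$ from the example and the multipliers $\mu^* = (0, 2, 1)$, then tests KKT conditions (c), (d), and (e):

```python
import numpy as np

# Stationarity (c): grad f(x*) = sum_j mu_j* grad c_j(x*)
grad_f = np.array([-1.0, 1.0])           # gradient of f(x1,x2) = -x1 + x2
grad_c = [np.array([1.0, 0.0]),          # grad c_1 (c_1 = x1)
          np.array([0.0, 1.0]),          # grad c_2 (c_2 = x2)
          np.array([-1.0, -1.0])]        # grad c_3 (c_3 = -x1 - x2 + 1)
mu = np.array([0.0, 2.0, 1.0])           # multipliers mu_1*, mu_2*, mu_3*

lhs = grad_f
rhs = sum(m * g for m, g in zip(mu, grad_c))
print(np.allclose(lhs, rhs))             # stationarity holds

# Complementarity (d): mu_j* c_j(x*) = 0 for each j
x_star = np.array([1.0, 0.0])
c_vals = np.array([x_star[0], x_star[1], -x_star[0] - x_star[1] + 1.0])
print(np.all(np.abs(mu * c_vals) < 1e-12))  # inactive c_1 gets mu_1 = 0

# Dual feasibility (e): mu_j* >= 0
print(np.all(mu >= 0))
```

Note how complementarity forces $\mu_1^* = 0$: constraint $c_1$ is inactive at $x^*=(1,0)$, so it cannot contribute to the gradient balance.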
