Karush-Kuhn-Tucker Conditions

1. The Lagrangian and Duality

Given the minimization problem
$$\mathop{\min}\limits_{x\in\mathbf{R}^n}f(x) \;,\;\;\text{ subject to }\;\; h(x)\leq0,\, l(x)=0$$
To keep the discussion simple and make the essence of the problem easier to see, we assume here a single inequality constraint and a single equality constraint.
Define the Lagrange dual function as:
$$g(u,v)=\mathop{\min}\limits_{x\in\mathbf{R}^n}L(x,u,v)=\mathop{\min}\limits_{x\in\mathbf{R}^n}\{f(x)+uh(x)+vl(x)\}$$
Then the dual problem of the original (primal) problem is
$$\mathop{\max}\limits_{u,v}g(u,v) \;,\;\;\text{ subject to }\;\; u\ge 0$$
Only the multiplier $u$ of the inequality constraint must be nonnegative; the multiplier $v$ of the equality constraint is unrestricted in sign.

Properties

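(1) The dual problem is a convex optimization problem. Regardless of whether the objective $f$ is convex, the dual function $g$ is concave, because it is a pointwise minimum over $x$ of functions that are affine in $(u,v)$. Maximizing a concave function over the convex set $u\ge 0$ is a convex problem.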

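(2) Weak duality: let $f^*$ and $g^*$ be the optimal values of the primal and dual problems, respectively. Then $g^*\leq f^*$.

Indeed, let $(u^*,v^*)$ be an optimal solution of the dual problem, so $u^*\ge 0$. For every primal feasible $x$ we have $u^*h(x)\leq 0$ and $v^*l(x)=0$, hence $L(x,u^*,v^*)\leq f(x)$. Therefore
$$g^*=g(u^*,v^*)=\mathop{\min}\limits_{x\in\mathbf{R}^n}L(x,u^*,v^*)\leq \mathop{\min}\limits_{x\text{ feasible}}L(x,u^*,v^*)\leq \mathop{\min}\limits_{x\text{ feasible}} f(x) =f^*$$
The first $\leq$ holds because the minimum is taken over a smaller set, and the second because of the constraints of the primal problem.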

(3) Strong duality: suppose the primal problem is convex and there exists a strictly feasible point, that is, there exists $x'$ such that $h(x')<0,\ l(x')=0$ (Slater's condition). Then $f^*= g^*$. The proof is not given here.

(4) Given a primal feasible point $x$ and a dual feasible point $(u,v)$, define the duality gap as
$$D(x,u,v)=f(x)-g(u,v)$$
If $D(x,u,v)=0$, then $x$ is an optimal solution of the primal problem and $(u,v)$ is an optimal solution of the dual problem.

Proof. Since $g(u,v)\leq g^*\leq f^*\leq f(x)$, we have $0\leq f(x)-f^*\leq f(x)-g^*\leq f(x)-g(u,v)=D(x,u,v)$. If $D(x,u,v)=0$, every inequality in the chain is an equality, and therefore $f(x)=f^*,\ g(u,v)=g^*$. $\Box$

Remark. The duality gap $D(x,u,v)$ can be used as a stopping criterion for iterative optimization algorithms: if $D(x,u,v)<\epsilon$, then $f(x)-f^*<\epsilon$.
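
To make properties (1), (2) and (4) concrete, here is a minimal numerical sketch. The toy problem (minimize $x^2$ subject to $1-x\leq 0$) and all function names are assumptions chosen for illustration; they do not come from the text above. For this problem the dual function has the closed form $g(u)=u-u^2/4$, and the optimal pair is $x^*=1$, $u^*=2$.

```python
# Toy problem (an assumption for illustration, not taken from the text):
#   minimize f(x) = x^2   subject to   h(x) = 1 - x <= 0
# Primal optimum: x* = 1 with f* = 1.

def f(x):
    return x * x

def h(x):
    return 1.0 - x

def g(u):
    # L(x, u) = x^2 + u * (1 - x) is minimized over x at x = u / 2,
    # which gives the closed form g(u) = u - u^2 / 4.
    return u - u * u / 4.0

def duality_gap(x, u):
    # D(x, u) = f(x) - g(u) for a primal feasible x and a dual feasible u.
    assert h(x) <= 0 and u >= 0
    return f(x) - g(u)

feasible_xs = [1.0 + 0.1 * i for i in range(31)]  # x in [1, 4]
dual_us = [0.1 * i for i in range(61)]            # u in [0, 6]

# Weak duality: every dual value bounds every feasible primal value from below.
weak_duality_holds = all(
    g(u) <= f(x) + 1e-12 for x in feasible_xs for u in dual_us
)

# Concavity of g, checked with a midpoint test on the grid.
g_is_midpoint_concave = all(
    g(0.5 * (a + b)) >= 0.5 * (g(a) + g(b)) - 1e-12
    for a in dual_us
    for b in dual_us
)

gap_at_optimum = duality_gap(1.0, 2.0)  # both points optimal, gap vanishes
gap_elsewhere = duality_gap(1.5, 1.0)   # upper bound on f(1.5) - f*

print(weak_duality_holds, g_is_midpoint_concave, gap_at_optimum, gap_elsewhere)
```

At the non-optimal pair $(x,u)=(1.5,1)$ the gap bounds the true suboptimality $f(1.5)-f^*$ from above, which is exactly what the stopping criterion in the remark relies on.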

2. KKT Conditions

Here we use the general form
$$\begin{aligned}\mathop{\min}\limits_{x\in\mathbf{R}^n}\quad & f(x)\\ \text{subject to}\quad & h_i(x)\leq0,\; i=1,\cdots,s\\ & l_j(x)=0,\; j=1,\cdots,t\end{aligned}$$
Define the Lagrangian:
$$\mathcal{L}(x,u,v)=f(x)+u^Th(x)+v^Tl(x)$$
Here $u\in\mathbf{R}^s_{+},\ v\in\mathbf{R}^t,\ h=(h_1,\cdots,h_s),\ l=(l_1,\cdots,l_t)$. The multipliers $v$ of the equality constraints are unrestricted in sign.

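Theorem. Assume that $f$, $h_i$, $l_j$ are differentiable and that strong duality holds (for example, the problem is convex and has a strictly feasible point). Then $x^*$ is an optimal solution of the primal problem and $(u^*,v^*)$ is an optimal solution of the dual problem if and only if the KKT conditions hold. The multipliers $u^*,v^*$ need not be unique in general.

  • stationarity: $\nabla_x\mathcal{L}(x^*,u^*,v^*)=0$
  • complementary slackness: $u_i^*h_i(x^*)=0$
  • primal feasibility: $h_i(x^*)\leq 0,\ l_j(x^*)=0$
  • dual feasibility: $u_i^*\ge 0$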
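
The four conditions can be checked numerically on a small instance. The sketch below uses a hypothetical toy problem (minimize $(x_1-3)^2+(x_2-2)^2$ subject to $x_1-1\leq 0$ and $x_1+x_2-2=0$) with hand-coded gradients; the problem and the helper name `kkt_residuals` are assumptions for illustration only. Because this toy problem is convex with affine constraints, a point satisfying all four conditions is optimal.

```python
# Toy problem (chosen for illustration; it does not come from the text):
#   minimize   f(x) = (x1 - 3)^2 + (x2 - 2)^2
#   subject to h(x) = x1 - 1 <= 0
#              l(x) = x1 + x2 - 2 = 0

def f(x):
    return (x[0] - 3.0) ** 2 + (x[1] - 2.0) ** 2

def kkt_residuals(x, u, v):
    """Return the violation of each KKT condition (0 means satisfied)."""
    x1, x2 = x
    grad_f = (2.0 * (x1 - 3.0), 2.0 * (x2 - 2.0))
    grad_h = (1.0, 0.0)  # gradient of h(x) = x1 - 1
    grad_l = (1.0, 1.0)  # gradient of l(x) = x1 + x2 - 2
    h = x1 - 1.0
    l = x1 + x2 - 2.0
    return {
        "stationarity": max(
            abs(grad_f[i] + u * grad_h[i] + v * grad_l[i]) for i in range(2)
        ),
        "complementary_slackness": abs(u * h),
        "primal_feasibility": max(max(h, 0.0), abs(l)),
        "dual_feasibility": max(-u, 0.0),
    }

# Candidate KKT point: x* = (1, 1) with multipliers u* = 2, v* = 2.
candidate = kkt_residuals((1.0, 1.0), u=2.0, v=2.0)

# Ignoring the inequality gives (1.5, 0.5) with u = 0, v = 3: it is
# stationary but violates h(x) <= 0, so it is not a KKT point.
relaxed = kkt_residuals((1.5, 0.5), u=0.0, v=3.0)

# Brute-force check along the feasible set {x1 <= 1, x2 = 2 - x1}.
grid = [1.0 - 0.001 * i for i in range(4001)]  # x1 from 1 down to -3
best_x1 = min(grid, key=lambda x1: f((x1, 2.0 - x1)))

print(candidate)
print(relaxed)
print(best_x1)
```

The brute-force search over the feasible segment agrees with the KKT point $x^*=(1,1)$, while the point obtained by dropping the inequality fails only the primal feasibility check.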

Example. P8 in Robert P. Rooderkerk, Harald J. van Heerde, Robust Optimization of the 0-1 Knapsack Problem: Balancing Risk and Return in Assortment Optimization, European Journal of Operational Research.
$$(\text{RAO}_{\text{Robust}})\qquad \mathop{\max}\limits_{x}\mathop{\min}\limits_{p\in U}\mathop{\sum}\limits_{k=1}^{n}p_kx_k$$
$$\text{subject to }\qquad \mathop{\sum}\limits_{k=1}^{n}w_kx_k\leq c, \;x_k\in\{0,1\},\;\;k=1,\cdots,n$$
with the uncertainty set defined as follows:
$$U=\left\{p \,\middle|\, (p-\bar{p})^T\Theta^{-1}(p-\bar{p})\leq r^2\right\}$$
where $\Theta$ is a symmetric positive definite matrix with elements $\theta_{kk'}$. Then $(\text{RAO}_{\text{Robust}})$ can be rewritten as follows:
$$\mathop{\max}\limits_{x}\mathop{\sum}\limits_{k=1}^{n}\bar{p}_kx_k-r\sqrt{\mathop{\sum}\limits_{k=1}^{n}\mathop{\sum}\limits_{k'=1}^{n}\theta_{kk'}x_kx_{k'}}$$
$$\text{subject to }\qquad \mathop{\sum}\limits_{k=1}^{n}w_kx_k\leq c, \;x_k\in\{0,1\},\;\;k=1,\cdots,n$$

Proof. Write $\tilde{p}=p-\bar{p}$, so that $\mathop{\sum}\limits_{k=1}^{n}p_kx_k=\mathop{\sum}\limits_{k=1}^{n}\bar{p}_kx_k+\mathop{\sum}\limits_{k=1}^{n}\tilde{p}_kx_k$ and $p\in U$ if and only if $\tilde{p}^T\Theta^{-1}\tilde{p}\leq r^2$. We only need to consider the perturbation part $\mathop{\min}\limits_{p\in U}\mathop{\sum}\limits_{k=1}^{n}\tilde{p}_kx_k$. For $x=0$ this minimum is $0$ and the claim is trivial, so assume $x\neq 0$. Define
$$f(\tilde{p})=\tilde{p}^Tx\;\;,\;\;\;h(\tilde{p})=\tilde{p}^T\Theta^{-1}\tilde{p}-r^2$$
and the Lagrangian
$$\mathcal{L}(\tilde{p},u)=\tilde{p}^Tx+u\left(\tilde{p}^T\Theta^{-1}\tilde{p}-r^2\right)$$
This subproblem is convex and $\tilde{p}=0$ is strictly feasible, so the KKT conditions apply. If $(\tilde{p}^*,u^*)$ is an optimal solution, stationarity gives
$$\nabla_{\tilde{p}}\mathcal{L}(\tilde{p}^*,u^*)=x+2u^*\Theta^{-1}\tilde{p}^*=0\tag{1}$$
Since $x\neq 0$, (1) forces $u^*>0$, and complementary slackness then makes the constraint active:
$$ {\tilde{p}^*}^T\Theta^{-1}\tilde{p}^*=r^2\tag{2}$$
From (1), we conclude $\tilde{p}^*=-\frac{1}{2u^*}\Theta x$. Substituting into (2) gives $\left(\frac{1}{2u^*}\right)^2x^T\Theta x=r^2$, and taking the positive root (because $u^*>0$) we get $\frac{1}{2u^*}=\sqrt{\frac{r^2}{{x}^T\Theta x}}$, so
$$\mathop{\min}\limits_{p\in U}\mathop{\sum}\limits_{k=1}^{n}\tilde{p}_kx_k ={x}^T\tilde{p}^*=-\sqrt{\frac{r^2}{{x}^T\Theta x}}\,{x}^T\Theta x=-r\sqrt{{x}^T\Theta x}$$
i.e.
$$\mathop{\min}\limits_{p\in U}\mathop{\sum}\limits_{k=1}^{n}p_kx_k=\mathop{\sum}\limits_{k=1}^{n}\bar{p}_kx_k-r\sqrt{\mathop{\sum}\limits_{k=1}^{n}\mathop{\sum}\limits_{k'=1}^{n}\theta_{kk'}x_kx_{k'}}.\qquad\Box$$
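
As a sanity check of the closed form, the following sketch compares it with a brute-force minimum over the boundary of a two-dimensional ellipsoid. The numbers for $\Theta$, $\bar{p}$, $r$ and $x$ are made up for illustration and are not taken from the paper. Boundary points are generated as $p=\bar{p}+rLw$ with $\Theta=LL^T$ (Cholesky factor) and $w$ on the unit circle; sampling only the boundary is enough because a linear objective with $x\neq 0$ attains its minimum there.

```python
import math

# Illustrative data (assumed for this sketch, not taken from the paper).
theta = [[2.0, 0.5], [0.5, 1.0]]  # symmetric positive definite
p_bar = [3.0, 4.0]
r = 1.5
x = [1.0, 1.0]

def closed_form(x):
    # sum_k p_bar_k x_k - r * sqrt(x^T Theta x)
    quad = sum(theta[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return sum(p_bar[k] * x[k] for k in range(2)) - r * math.sqrt(quad)

def sampled_minimum(x, n_angles=3600):
    # Cholesky factor L of the 2x2 matrix Theta, so Theta = L L^T.
    l11 = math.sqrt(theta[0][0])
    l21 = theta[1][0] / l11
    l22 = math.sqrt(theta[1][1] - l21 * l21)
    best = math.inf
    for i in range(n_angles):
        angle = 2.0 * math.pi * i / n_angles
        w1, w2 = math.cos(angle), math.sin(angle)
        # p = p_bar + r * L w lies on the boundary of U.
        p1 = p_bar[0] + r * l11 * w1
        p2 = p_bar[1] + r * (l21 * w1 + l22 * w2)
        best = min(best, p1 * x[0] + p2 * x[1])
    return best

exact = closed_form(x)
approx = sampled_minimum(x)
print(exact, approx)
```

The sampled minimum can never fall below the closed-form value and approaches it as the number of sampled angles grows.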
