Lagrange Duality and the KKT Conditions

Lagrange duality. Consider the original (primal) optimization problem
$$
\begin{aligned} &\min & f_0(x) \\ &\text{s.t.} & f_i(x) \le 0\\ && h_i(x) = 0\end{aligned}
$$
The dual function is
$$
g(\lambda,\nu)=\inf_{x\in \mathcal{D}} L(x,\lambda,\nu)=\inf_{x\in \mathcal{D}} \left(f_0(x) +\sum \lambda_i f_i(x) +\sum \nu_i h_i(x) \right)
$$
The $\inf$ may take the value $-\infty$.
Define $\operatorname{dom} g=\{(\lambda,\nu)\mid g(\lambda,\nu) > -\infty \}$.
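Weak duality: let $p^\star$ be the optimal value of the primal problem. For any $\lambda \succeq 0$ and any $\nu$,
$$
g(\lambda,\nu) \le p^\star
$$
The reason is that at any feasible $x$ the terms $\lambda_i f_i(x)$ are nonpositive and the terms $\nu_i h_i(x)$ vanish, so $g(\lambda,\nu) \le L(x,\lambda,\nu) \le f_0(x)$. Taking the best lower bound gives
$$
d^\star=\sup g(\lambda,\nu)\le p^\star
$$
$p^\star - d^\star$ is called the duality gap. If the duality gap is 0, strong duality is said to hold.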
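To make weak duality concrete, here is a minimal sketch on a toy problem that is not from the original post and is assumed only for illustration: minimize $x^2$ subject to $1 - x \le 0$, for which $p^\star = 1$. The names `lagrangian`, `dual`, and `P_STAR` are hypothetical helpers.

```python
# Toy problem (an assumption for illustration, not from the original post):
#   minimize x^2  subject to  1 - x <= 0
# Primal optimum: x* = 1, p* = 1.

def lagrangian(x, lam):
    """L(x, lam) = f0(x) + lam * f1(x)."""
    return x * x + lam * (1.0 - x)

def dual(lam):
    """g(lam) = inf_x L(x, lam). L is a convex quadratic in x,
    so the infimum is attained where dL/dx = 2x - lam = 0."""
    x_min = lam / 2.0
    return lagrangian(x_min, lam)

P_STAR = 1.0

# Weak duality: g(lam) <= p* for every lam >= 0.
lams = [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]
for lam in lams:
    print(f"lam={lam:4.1f}  g(lam)={dual(lam):7.3f}  <= p*={P_STAR}")

# Best lower bound over a grid of lam in [0, 10]; the maximum sits at lam = 2.
d_star = max(dual(k / 100.0) for k in range(0, 1001))
print("d* ~", d_star, " duality gap ~", P_STAR - d_star)
```

In this toy case the gap closes (the grid maximum matches $p^\star$), which is what one would expect for a convex problem with a strictly feasible point such as $x = 2$.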
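There are many criteria for strong duality. A simple one is Slater's condition (the linked page has a proof).
For a geometric view, see: Zhihu answer
That answer explains intuitively that when the optimization problem is convex, the projection of its epigraph is convex, and a convex set has a supporting hyperplane at every boundary point. However, I do not know why strong duality holds in most cases for convex problems (it does not seem to follow from this picture alone).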

Complementary slackness

If strong duality holds (whether or not the primal problem is convex), and $x^\star$ and $(\lambda^\star,\nu^\star)$ are primal and dual optimal, then
$$
\begin{aligned}f_0(x^\star)&=g(\lambda^\star,\nu^\star)\\ &=\inf \left(f_0(x) +\sum \lambda_i^\star f_i(x) +\sum \nu_i^\star h_i(x) \right)\\ &\le f_0(x^\star) +\sum \lambda_i^\star f_i(x^\star) +\sum \nu_i^\star h_i(x^\star)\\ &\le f_0(x^\star) \end{aligned}
$$
The first and last expressions are equal, so both inequalities hold with equality. In particular $\sum \lambda_i^\star f_i(x^\star)=0$, and since every term in this sum is nonpositive, each one must be zero: $\lambda_i^\star f_i(x^\star)=0$. This equality is called complementary slackness.

KKT (Karush-Kuhn-Tucker) conditions

Putting together complementary slackness, the gradient condition at the optimum, and the constraints of the optimization problem gives the KKT conditions. When strong duality holds and the functions are differentiable, they are necessary conditions: any primal-dual optimal pair must satisfy them.
$$
\begin{aligned}
f_i(x^\star) &\le 0\\
h_i(x^\star) &= 0\\
\lambda_i^\star &\ge 0\\
\lambda_i^\star f_i(x^\star) &= 0\\
\nabla f_0(x^\star) +\sum \lambda_i^\star \nabla f_i(x^\star) +\sum \nu_i^\star \nabla h_i(x^\star) &= 0
\end{aligned}
$$
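As a rough illustration of how the conditions are checked at a candidate point, the sketch below evaluates the four KKT residuals for a toy problem that is assumed here and does not appear in the original post: minimize $x^2$ subject to $1 - x \le 0$ and $x - 3 \le 0$. The function `kkt_residuals` is a hypothetical helper, and the problem has no equality constraints, so the $h_i$ terms are absent.

```python
# Hypothetical toy problem (not from the post):
#   minimize x^2  s.t.  f1(x) = 1 - x <= 0,  f2(x) = x - 3 <= 0
# Candidate: x* = 1 with multipliers lam* = (2, 0).

def kkt_residuals(x, lam):
    """Return how far (x, lam) is from satisfying each KKT condition."""
    f = [1.0 - x, x - 3.0]   # inequality constraint values f1(x), f2(x)
    grad_f0 = 2.0 * x        # gradient of the objective x^2
    grad_f = [-1.0, 1.0]     # gradients of f1 and f2
    return {
        # violation of f_i(x) <= 0
        "primal": max(max(fi, 0.0) for fi in f),
        # violation of lam_i >= 0
        "dual": max(max(-li, 0.0) for li in lam),
        # size of lam_i * f_i(x), which should be 0
        "comp_slack": max(abs(li * fi) for li, fi in zip(lam, f)),
        # size of grad f0 + sum lam_i * grad f_i, which should be 0
        "stationarity": abs(grad_f0 + sum(li * gi for li, gi in zip(lam, grad_f))),
    }

res = kkt_residuals(1.0, [2.0, 0.0])
print(res)

# A feasible but non-optimal point fails the stationarity condition.
bad = kkt_residuals(2.0, [0.0, 0.0])
print(bad)
```

The first constraint is active at $x^\star = 1$ and carries a positive multiplier, while the second is inactive and its multiplier is zero, which is the pattern complementary slackness describes.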

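If the primal problem is convex, the KKT conditions are also sufficient: any point satisfying them is primal and dual optimal with zero duality gap. This is because
$$
\begin{aligned}g(\lambda^\star,\nu^\star)&=L(x^\star,\lambda^\star,\nu^\star)\\ &= f_0(x^\star) +\sum \lambda_i^\star f_i(x^\star) +\sum \nu_i^\star h_i(x^\star)\\ &= f_0(x^\star) \end{aligned}
$$
The first equality holds because $L(x,\lambda^\star,\nu^\star)$ is convex in $x$ and its gradient vanishes at $x^\star$, so $x^\star$ minimizes it. The last equality uses $\lambda_i^\star f_i(x^\star)=0$ and $h_i(x^\star)=0$.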

A convex optimization example where strong duality fails, with a geometric explanation
$$
\min\ e^{-x} \quad \text{s.t.}\quad x^2/y \le 0
$$
The domain is $\mathcal{D}=\{(x,y)\mid y>0\}$, and the Lagrangian is
$$
L(x,y,\lambda)=e^{-x}+\lambda x^2/y
$$
The constraint forces $x=0$, so $p^\star=1$, while $d^\star=0$. The duality gap is 1.
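The values $p^\star=1$ and $d^\star=0$ can be probed numerically. The sketch below is only an illustration: the path $x=t$, $y=t^3$ is one choice among many (an assumption of this example) along which the Lagrangian tends to 0, and since the Lagrangian is always positive for $\lambda \ge 0$, this suggests $g(\lambda)=0$.

```python
import math

def lagrangian(x, y, lam):
    """L(x, y, lam) = exp(-x) + lam * x^2 / y on the domain y > 0."""
    return math.exp(-x) + lam * x * x / y

lam = 1.0

# Along the path x = t, y = t^3 the Lagrangian equals exp(-t) + lam / t,
# which shrinks toward 0 as t grows.
values = [lagrangian(t, t ** 3, lam) for t in (1.0, 10.0, 100.0, 1000.0)]
print(values)

# Feasible points have x = 0 and y > 0, where the objective is exp(0) = 1.
p_star = lagrangian(0.0, 1.0, lam)
print("p* =", p_star)
```

The infimum is approached but never attained, and it is reached only by moving far away from the feasible set, which is why the dual bound stays strictly below $p^\star$.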

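Writing $u=x^2/y$ and $t=e^{-x}$, the feasible region reduces to the single point $(u,t)=(0,1)$, while $g(\lambda)$ is 0. So the line $\lambda u + t=g(\lambda)$ is not a supporting hyperplane of the epigraph at that point.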
