$$\min_x \; c^T x \quad \text{s.t.} \quad Ax = b, \quad x \geq 0 \tag{1}$$
$$\max_y \; b^T y \quad \text{s.t.} \quad A^T y + s = c, \quad s \geq 0 \tag{2}$$
where $A \in \mathbb{R}^{m\times n}$, $x \in \mathbb{R}^{n}$, $s \in \mathbb{R}^{n}$, $y \in \mathbb{R}^{m}$.
Consider the optimization problem $$\min_x \quad f(x) = g(x) + h(x)$$ where $g(x)$ and $h(x)$ are both closed convex functions.
Starting from an arbitrary initial point $z^{0}$, repeat the following steps for $k = 1, 2, \dots$ until convergence:
$$\begin{aligned} x^{k} &= \mathrm{prox}_{th}(z^{k-1}) \\ y^{k} &= \mathrm{prox}_{tg}(2x^{k} - z^{k-1}) \\ z^{k} &= z^{k-1} + y^{k} - x^{k} \end{aligned} \tag{3}$$
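To make iteration (3) concrete, here is a minimal sketch on a toy problem that is not from the original post: minimize $\frac{1}{2}\|x-a\|_2^2$ subject to $x \geq 0$, whose solution is known to be $\max(a, 0)$. The data `a`, the step size `t = 1`, and the fixed count of 200 iterations are illustrative assumptions. The prox operator is defined in the note that follows.

```matlab
% Toy problem (assumption): g(x) = 0.5*||x - a||^2, h(x) = indicator of {x >= 0}.
% The exact minimizer of g + h is max(a, 0).
a = [2; -1; 0.5; -3];                  % illustrative data
t = 1;                                 % step size
prox_th = @(v) max(v, 0);              % projection onto {x >= 0}
prox_tg = @(v) (v + t*a) / (1 + t);    % closed-form prox of t*g for this quadratic g
z = zeros(size(a));                    % arbitrary starting point z^0
for k = 1:200
    x = prox_th(z);                    % x^k
    y = prox_tg(2*x - z);              % y^k
    z = z + y - x;                     % z^k
end
x = prox_th(z);                        % final estimate
disp([x, max(a, 0)]);                  % the two columns should agree
```

Note that the solution is read off from $x = \mathrm{prox}_{th}(z)$, not from $z$ itself; $z$ is only an auxiliary variable.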
Note:
$$\mathrm{prox}_f(x) = \arg\min_u \quad f(u) + \frac{1}{2}\|u-x\|_2^2$$
$$\mathrm{prox}_{tf}(x) = \arg\min_u \quad t f(u) + \frac{1}{2}\|u-x\|_2^2 = \arg\min_u \quad f(u) + \frac{1}{2t}\|u-x\|_2^2$$
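As a quick sanity check of the two equivalent forms of $\mathrm{prox}_{tf}$, the sketch below uses $f(u) = |u|$ in one dimension, an example chosen here and not taken from the original post. For this $f$ the prox is the well-known soft-thresholding operator, $\mathrm{sign}(x)\max(|x|-t, 0)$. The values of `x` and `t` and the grid resolution are assumptions.

```matlab
x = 1.3;  t = 0.5;                               % illustrative values
u = linspace(-3, 3, 600001);                     % fine grid, spacing 1e-5
[~, i1] = min(t*abs(u) + 0.5*(u - x).^2);        % first form:  t*f(u) + 1/2*(u - x)^2
[~, i2] = min(abs(u) + (u - x).^2 / (2*t));      % second form: f(u) + 1/(2t)*(u - x)^2
soft = sign(x) * max(abs(x) - t, 0);             % closed form: soft-thresholding
fprintf('form 1: %.4f, form 2: %.4f, closed form: %.4f\n', u(i1), u(i2), soft);
```

Both grid searches should land on the same point up to the grid spacing, because the second objective is just the first divided by $t > 0$.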
Back to the main topic. The primal linear program (1) can be written as an unconstrained problem:
$$\min_x \quad c^T x + \mathbb{I}_{Ax=b}(x) + \mathbb{I}_{x\geq 0}(x)$$
where $\mathbb{I}_S$ denotes the indicator function of the set $S$, i.e.
$$\mathbb{I}_S(x) = \begin{cases} 0, & x \in S \\ \infty, & \text{otherwise} \end{cases}$$
The objective can therefore be split into two parts:
$$g(x) = c^T x + \mathbb{I}_{Ax=b}(x), \qquad h(x) = \mathbb{I}_{x\geq 0}(x)$$
Applying the Douglas-Rachford iteration to this splitting then yields the solution $x$.
Now for the details: what are the closed-form expressions of $\mathrm{prox}_{tg}$ and $\mathrm{prox}_{th}$?
Start with the easier one:
$$\mathrm{prox}_{th}(x) = \arg\min_u \quad h(u) + \frac{1}{2t}\|u-x\|_2^2$$
In other words, for a fixed $x$, $\mathrm{prox}_{th}(x)$ is the solution of
$$\min_u \quad \frac{1}{2t}\|u-x\|_2^2 \quad \text{s.t.} \quad u \geq 0$$
Put plainly, this is the projection of $x$ onto the nonnegative orthant. Hence, with the maximum taken elementwise,
$$\mathrm{prox}_{th}(x) = \max(x, 0) \tag{4}$$
The next one is slightly more involved. For a fixed $x$, $\mathrm{prox}_{tg}(x)$ is the solution of
$$\min_u \quad c^T u + \frac{1}{2t}\|u-x\|_2^2 \quad \text{s.t.} \quad Au = b$$
Write down the Lagrangian:
$$L(u,\lambda) = c^T u + \frac{1}{2t}\|u-x\|_2^2 + \lambda^T(Au-b)$$
The optimality condition gives
$$0 = \nabla_u L = c + A^T\lambda + \frac{1}{t}(u-x) \quad\Rightarrow\quad u = x - t(c + A^T\lambda)$$
Substituting into the constraint (and assuming $A$ has full row rank, so that $AA^T$ is invertible):
$$0 = Au - b = Ax - tAc - tAA^T\lambda - b \quad\Rightarrow\quad \lambda = (tAA^T)^{-1}\big(A(x-tc) - b\big)$$
Solving for $u$:
$$u = x - t(c + A^T\lambda) = x - t\Big(c + A^T(tAA^T)^{-1}\big(A(x-tc) - b\big)\Big)$$
that is,
$$\mathrm{prox}_{tg}(x) = x - t\Big(c + A^T(tAA^T)^{-1}\big(A(x-tc) - b\big)\Big) \tag{5}$$
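Formula (5) can be checked numerically. The sketch below builds a small random instance (the sizes, the seed, and `t = 0.1` are assumptions) and compares (5) with a direct solve of the KKT system $\begin{bmatrix} I/t & A^T \\ A & 0 \end{bmatrix}\begin{bmatrix} u \\ \lambda \end{bmatrix} = \begin{bmatrix} x/t - c \\ b \end{bmatrix}$, which is the optimality condition and the constraint stacked into one linear system.

```matlab
rng(0);                                    % fixed seed for reproducibility
m = 3;  n = 6;  t = 0.1;                   % small illustrative sizes
A = randn(m, n);  b = randn(m, 1);         % a random A has full row rank almost surely
c = randn(n, 1);  x = randn(n, 1);
% Closed form (5)
lambda = (t*(A*A')) \ (A*(x - t*c) - b);
u = x - t*(c + A'*lambda);
% Reference: solve the stacked KKT system directly
K = [eye(n)/t, A'; A, zeros(m)];
sol = K \ [x/t - c; b];
fprintf('feasibility ||A*u - b||  = %.2e\n', norm(A*u - b));
fprintf('difference to KKT solve  = %.2e\n', norm(u - sol(1:n)));
```

Both printed quantities should be at the level of floating-point round-off.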
Substituting (4) and (5) into the Douglas-Rachford iteration (3) gives the complete solution procedure.
Here is the code:
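function [x] = LP_DRS_primal(c, A, b, opts, x0)
% Solve  min c'*x  s.t.  A*x = b, x >= 0  by Douglas-Rachford splitting.
% opts is currently unused; x0 only seeds the first stopping test.
    n = size(A,2);
    z = randn(n,1);      % random initialization of z
    x = x0;              % initial x, compared against the first iterate
    S = A*A';
    U = chol(S);         % Cholesky factorization S = U'*U (needs full row rank A)
    L = U';              % so that S = L*U
    t = 0.1;             % step size (hyperparameter)
    prox_th = @(x) max(x,0);                                  % equation (4)
    prox_tg = @(x) x-t*(c+A'*((t*U)\(L\(A*(x-t*c)-b))));      % equation (5)
    err = 1;
    x_old = x;
    while (err > 1e-6)
        x = prox_th(z);
        y = prox_tg(2*x-z);
        z = z + y - x;
        err = norm(x-x_old);   % change in x between consecutive iterations
        x_old = x;
    end
end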
Let's see how it performs:
% Generate data
n = 100;
m = 20;
A = rand(m,n);
xs = full(abs(sprandn(n,1,m/n)));   % sparse nonnegative reference solution
b = A*xs;
y = randn(m,1);
s = rand(n,1).*(xs==0);             % dual slack, zero wherever xs is nonzero
c = A'*y + s;
x0 = randn(n,1);                    % initial point passed to the solver
% Error measure
errfun = @(x1, x2) norm(x1-x2)/(1+norm(x1));
% Reference solution
figure(1);
subplot(2,1,1);
stem(xs,'fill','k-.')
title('exact solution');
% Solve with DRS
opts = [];
tic;
[x1] = LP_DRS_primal(c, A, b, opts, x0);
t1 = toc;
subplot(2,1,2);
stem(x1,'fill','k-.');
title('lp_drs_primal', 'Interpreter', 'none');
fprintf('lp-drs-primal: cpu: %5.2f, err-to-exact: %3.2e\n', t1, errfun(x1, xs));
Output:
lp-drs-primal: cpu: 0.14, err-to-exact: 2.92e-04
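It may help to see why `xs` can serve as the reference solution. The test data is built so that `xs` is feasible for the primal (1), `(y, s)` is feasible for the dual (2), and `xs` and `s` are complementary, which together imply a zero duality gap. The sketch below regenerates data the same way and checks these conditions; the seed is an assumption added here for reproducibility.

```matlab
rng(1);                                % fixed seed
n = 100;  m = 20;
A  = rand(m, n);
xs = full(abs(sprandn(n, 1, m/n)));    % sparse, nonnegative primal point
b  = A*xs;                             % makes xs primal feasible: A*xs = b, xs >= 0
y  = randn(m, 1);
s  = rand(n, 1) .* (xs == 0);          % s >= 0, and s_i = 0 wherever xs_i > 0
c  = A'*y + s;                         % makes (y, s) dual feasible: A'*y + s = c
gap = c'*xs - b'*y;                    % equals s'*xs in exact arithmetic
fprintf('primal feasibility: %.2e\n', norm(A*xs - b));
fprintf('complementarity:    %.2e\n', xs'*s);
fprintf('duality gap:        %.2e\n', gap);
```

A zero duality gap between a primal-feasible and a dual-feasible point certifies that both are optimal, so the error reported by `errfun` measures the distance to a true optimum, provided that optimum is unique.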