这篇博客看完以后接着看下一篇博客添加链接描述专门介绍MINRES算法实现就容易了
首先介绍Lanczos分解,Lanzos把对称矩阵转换为一个三对角对称矩阵。考虑三对角对称矩阵如下,考虑正交分解
T = Q T A Q T = Q^T A Q T=QTAQ
T = ( α 1 β 1 0 ⋯ 0 0 β 1 α 2 β 2 0 ⋯ 0 0 β 2 α 3 β 3 ⋯ 0 0 ⋯ ⋯ ⋯ ⋯ 0 0 ⋯ 0 β n − 2 α n − 1 β n − 1 0 0 ⋯ 0 β n − 1 α n ) T=\left(\begin{array}{cccccc} \alpha_1 & \beta_1 & 0 & \cdots & 0 & 0\\ \beta_1 & \alpha_2 & \beta_2 & 0 & \cdots & 0 \\ 0 & \beta_2 & \alpha_3 & \beta_3 & \cdots & 0\\ 0 & \cdots & \cdots & \cdots & \cdots & 0\\ 0 & \cdots & 0 & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1}\\ 0 & 0 & \cdots & 0 & \beta_{n-1} & \alpha_n \end{array}\right) T=⎝⎜⎜⎜⎜⎜⎜⎛α1β10000β1α2β2⋯⋯00β2α3⋯0⋯⋯0β3⋯βn−200⋯⋯⋯αn−1βn−10000βn−1αn⎠⎟⎟⎟⎟⎟⎟⎞
下面重点考虑正交矩阵 Q Q Q和三对角矩阵 T T T的形成,我们记 Q = [ q 1 , q 2 , … , q n ] , q i ∈ R n Q=[q_1,q_2,\ldots,q_n],q_i \in R^n Q=[q1,q2,…,qn],qi∈Rn,则根据 Q T = A Q QT = AQ QT=AQ,我们会得到下面这个等式,约定 β 0 q 0 = β n q n = 0 \beta_0 q_0 = \beta_n q_n = 0 β0q0=βnqn=0:
A q i = β i − 1 q i − 1 + α i q i + β i q i + 1 , 1 ≤ i ≤ n . Aq_i = \beta_{i - 1}q_{i - 1} + \alpha_i q_i + \beta_i q_{i + 1},1 \leq i \leq n. Aqi=βi−1qi−1+αiqi+βiqi+1,1≤i≤n.
我们先考虑 α 1 , β 1 \alpha_1,\beta_1 α1,β1的确定,任意取一个向量 q 1 ∈ R n , ∥ q 1 ∥ 2 = 1 q_1 \in R^n, \| q_1 \|_2 = 1 q1∈Rn,∥q1∥2=1,则有
{ A q 1 = α 1 q 1 + β 1 q 2 , α 1 = q 1 T A q 1 , β 1 = ∥ A q 1 − α 1 q 1 ∥ 2 , q 2 = ( A q 1 − α 1 q 1 ) / β 1 . \left\{\begin{array}{l} Aq_1 = \alpha_1 q_1 + \beta_1 q_2,\\ \alpha_1 = q_1^T A q_1,\\ \beta_1 = \|Aq_1 - \alpha_1 q_1\|_2,\\ q_2 = (Aq_1 - \alpha_1 q_1)/\beta_1. \end{array}\right. ⎩⎪⎪⎨⎪⎪⎧Aq1=α1q1+β1q2,α1=q1TAq1,β1=∥Aq1−α1q1∥2,q2=(Aq1−α1q1)/β1.
类似地,假设我们已经得到了 [ q 1 , q 2 , … , q k ] [q_1,q_2,\ldots,q_k] [q1,q2,…,qk]和 [ α 1 , α 2 , … , α k − 1 ] , [ β 1 , β 2 , … , β k − 1 ] [\alpha_1,\alpha_2,\ldots,\alpha_{k - 1}],[\beta_1,\beta_2,\ldots,\beta_{k-1}] [α1,α2,…,αk−1],[β1,β2,…,βk−1],下面同样可以类似得到 α k , β k , q k + 1 \alpha_k,\beta_k,q_{k+1} αk,βk,qk+1,
{ A q k = β k − 1 q k − 1 + α k q k + β k q k + 1 , α k = q k T ( A q k − β k − 1 q k − 1 ) , β k = ∥ A q k − β k − 1 q k − 1 ∥ 2 , q k + 1 = ( A q k − β k − 1 q k − 1 ) / β k . \left\{\begin{array}{l} Aq_k = \beta_{k - 1}q_{k - 1} + \alpha_k q_k + \beta_k q_{k+1},\\ \alpha_k = q_k^T (Aq_k - \beta_{k - 1}q_{k - 1}),\\ \beta_k = \|Aq_k - \beta_{k - 1}q_{k - 1}\|_2,\\ q_{k+1} = (Aq_k - \beta_{k - 1}q_{k - 1})/\beta_k. \end{array}\right. ⎩⎪⎪⎨⎪⎪⎧Aqk=βk−1qk−1+αkqk+βkqk+1,αk=qkT(Aqk−βk−1qk−1),βk=∥Aqk−βk−1qk−1∥2,qk+1=(Aqk−βk−1qk−1)/βk.
引入记号 Q k = [ q 1 , q 2 , … , q k ] Q_k = [q_1,q_2,\ldots,q_k] Qk=[q1,q2,…,qk],和 T k T_k Tk,其中有
T k = ( α 1 β 1 0 ⋯ 0 0 β 1 α 2 β 2 0 ⋯ 0 0 β 2 α 3 β 3 ⋯ 0 0 ⋯ ⋯ ⋯ ⋯ 0 0 ⋯ 0 β k − 2 α k − 1 β k − 1 0 0 ⋯ 0 β k − 1 α k ) T_k=\left(\begin{array}{cccccc} \alpha_1 & \beta_1 & 0 & \cdots & 0 & 0\\ \beta_1 & \alpha_2 & \beta_2 & 0 & \cdots & 0 \\ 0 & \beta_2 & \alpha_3 & \beta_3 & \cdots & 0\\ 0 & \cdots & \cdots & \cdots & \cdots & 0\\ 0 & \cdots & 0 & \beta_{k-2} & \alpha_{k-1} & \beta_{k-1}\\ 0 & 0 & \cdots & 0 & \beta_{k-1} & \alpha_k\\ \end{array}\right) Tk=⎝⎜⎜⎜⎜⎜⎜⎛α1β10000β1α2β2⋯⋯00β2α3⋯0⋯⋯0β3⋯βk−200⋯⋯⋯αk−1βk−10000βk−1αk⎠⎟⎟⎟⎟⎟⎟⎞
则有 A Q k = Q k T k + β k q k + 1 e k T AQ_k = Q_k T_k + \beta_{k} q_{k + 1} e_{k}^T AQk=QkTk+βkqk+1ekT,其中 e k ∈ R k e_k \in \boldsymbol{R}^k ek∈Rk且最后一个元素为1,其余元素为0,如果 β k ≠ = 0 , 1 ≤ k ≤ n − 1 \beta_k \neq = 0,1 \leq k \leq n - 1 βk==0,1≤k≤n−1,则可以顺利得到对称矩阵A的三对角分解。如果求解过程中某个 β k 0 = 0 \beta_{k_0}= 0 βk0=0,那么显然可以根据 A Q k = Q k T k AQ_k = Q_k T_k AQk=QkTk得到前 k 0 − 1 k_0 - 1 k0−1个特征向量,如果要计算剩下的特征向量和特征值,只需要重新初始化一个 q 1 q_1 q1即可。
MINRES主要应用于对称不定方程求解,考虑线性方程组如下\eqref{line}所示,其中 A ∈ R n × n A \in \boldsymbol{R}^{n \times n} A∈Rn×n是对称矩阵。
A x = b Ax = b Ax=b
MINRES算法的出发点是寻求一个向量 x ( k ) ∈ x ( 0 ) + κ k ( A , r 0 ) x^{(k)} \in x^{(0)} + \kappa_k(A,r_0) x(k)∈x(0)+κk(A,r0),即
x k = min x ∈ x 0 + κ k ( A , r 0 ) ∥ A x − b ∥ 2 . x^{k} = \min_{x \in x^0 + \kappa_k(A,r_0)} \| Ax -b \|_2. xk=minx∈x0+κk(A,r0)∥Ax−b∥2.
其中 κ k ( A , r 0 ) = ( r 0 , A r 0 , … , A k − 1 r 0 ) \kappa_k(A,r_0) = (r_0,Ar_0,\ldots,A^{k-1}r_0) κk(A,r0)=(r0,Ar0,…,Ak−1r0)形成的向量空间。
选择 q 1 = r 0 ∥ r 0 ∥ , r 0 = b − A x 0 q_1 = \frac{r^0}{\|r^0\|},r^0 = b - Ax^0 q1=∥r0∥r0,r0=b−Ax0,则 ( r 0 , A r 0 , … , A k − 1 r 0 ) = ( q 1 , A q 1 , … , A k − 1 q 1 ) ∥ r 0 ∥ 2 (r^0,Ar^0,\ldots,A^{k-1}r^0)=(q_1,Aq_1,\ldots,A^{k-1}q_1)\|r^0\|_2 (r0,Ar0,…,Ak−1r0)=(q1,Aq1,…,Ak−1q1)∥r0∥2,再根据 A q k = β k − 1 q k − 1 + α k q k + β k q k + 1 Aq_k = \beta_{k - 1}q_{k - 1} + \alpha_k q_k + \beta_k q_{k+1} Aqk=βk−1qk−1+αkqk+βkqk+1可得 q k + 1 ∈ κ k ( q 1 ) = κ k ( r 0 ) q_{k+1} \in \kappa_{k}(q_1) = \kappa_{k} (r^0) qk+1∈κk(q1)=κk(r0),由此可以得到对于任意一个向量 x x x属于 Q k Q_k Qk张成的向量空间,则必有 x ∈ κ k ( r 0 ) x \in \kappa_{k}(r^0) x∈κk(r0),反之亦然。
引入 Q k + 1 = [ Q k , q k + 1 ] Q_{k+1} = [Q_k,q_{k+1}] Qk+1=[Qk,qk+1],则根据上面的推导可以得到 A Q k = Q k + 1 T k + 1 , k AQ_k = Q_{k+1}T_{k+1,k} AQk=Qk+1Tk+1,k,
T k + 1 , k = ( T k β k e k T ) T_{k+1,k}=\left(\begin{array}{c} T_k \\ \beta_k e_k^T \end{array}\right) Tk+1,k=(TkβkekT)
于是乎可以把原始问题转换为下面这个形式,特别注意,这里的 e 1 ∈ R k + 1 e_1 \in \boldsymbol{R}^{k+1} e1∈Rk+1,上面的 e k ∈ R k e_k \in \boldsymbol{R}^k ek∈Rk:
{ x k = min x ∈ x 0 + κ k ( A , r 0 ) ∥ A x − b ∥ 2 = min y ∈ R k ∥ A ( x 0 + Q k y ) − b ∥ 2 = min y ∈ R k ∥ r 0 − A Q k y ∥ 2 = min y ∈ R k ∥ r 0 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 q 1 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 Q k + 1 e 1 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ Q k + 1 ( ∥ r 0 ∥ 2 e 1 − T k + 1 , k y ) ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 e 1 − T k + 1 , k y ∥ 2 . \left\{\begin{aligned} x^{k} &= \min_{x \in x^0 + \kappa_k(A,r_0)} \| Ax -b \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| A(x^0 + Q_k y) -b \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| r^0 - AQ_k y \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| r^0 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 q_1 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 Q_{k+1}e_1 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| Q_{k + 1}(\|r^0\|_2 e_1 - T_{k+1,k} y)\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 e_1 - T_{k+1,k} y\|_2. \end{aligned}\right. ⎩⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎧xk=x∈x0+κk(A,r0)min∥Ax−b∥2=y∈Rkmin∥A(x0+Qky)−b∥2=y∈Rkmin∥r0−AQky∥2=y∈Rkmin∥r0−Qk+1Tk+1,ky∥2=y∈Rkmin∥∥r0∥2q1−Qk+1Tk+1,ky∥2=y∈Rkmin∥∥r0∥2Qk+1e1−Qk+1Tk+1,ky∥2=y∈Rkmin∥Qk+1(∥r0∥2e1−Tk+1,ky)∥2=y∈Rkmin∥∥r0∥2e1−Tk+1,ky∥2.
引入记号 a = ∥ r 0 ∥ 2 a = \|r^0\|_2 a=∥r0∥2,原始问题进一步转化为求下面这个问题,对于这个问题,我们引入QR分解来处理。
min y ∈ R k ∥ a e 1 − T k + 1 , k y ∥ 2 . \min_{y \in \boldsymbol{R}^k} \| a e_1 - T_{k + 1,k} y \|_2. miny∈Rk∥ae1−Tk+1,ky∥2.
考虑矩阵 T k + 1 , k T_{k+1,k} Tk+1,k的QR分解,为了区分上面对称矩阵 A A A的正交分解,这里我们假设 T k + 1 , k = V k + 1 R k + 1 , k T_{k+1,k} = V_{k+1}R_{k+1,k} Tk+1,k=Vk+1Rk+1,k,其中 V k + 1 ∈ R ( k + 1 ) × ( k + 1 ) V_{k+1} \in R^{(k+1) \times (k+1)} Vk+1∈R(k+1)×(k+1)是正交矩阵,
R k + 1 , k = ( r 0 , 0 ⋯ ⋯ 0 r 1 , 1 ⋯ 0 0 ⋯ ⋯ 0 0 ⋯ ⋯ r k − 1 , k − 1 0 0 ⋯ 0 0 0 ) = ( R k 0 T ) R_{k+1,k}=\left(\begin{array}{cccccc} r_{0,0} & & & \cdots & \cdots & \\ 0 & r_{1,1} & & & \cdots & \\ 0 & 0 & \cdots & \cdots & & \\ 0 & 0 & \cdots & \cdots & &r_{k-1,k-1} \\ 0 & 0 & \cdots & 0 & 0 & 0\\ \end{array}\right) =\left(\begin{array}{c} R_k \\ 0^T \\ \end{array}\right) Rk+1,k=⎝⎜⎜⎜⎜⎛r0,00000r1,1000⋯⋯⋯⋯⋯⋯0⋯⋯0rk−1,k−10⎠⎟⎟⎟⎟⎞=(Rk0T)
有了上面的介绍以后,我们直接带入问题就可以得到
∥ a e 1 − T k + 1 , k y ∥ 2 = ∥ a e 1 − V k + 1 R k + 1 , k y ) ∥ 2 = ∥ V k + 1 ( V k + 1 ⊤ ( a e 1 ) − [ R k 0 ] y ) ∥ 2 = ∥ ( V k + 1 ⊤ ( a e 1 ) − [ R k y 0 ] ) ∥ 2 = ∥ [ V k + 1 , k , v k + 1 ] ⊤ ( a e 1 ) − [ R k y 0 ] ∥ 2 , \begin{aligned} \left\|a e_1-T_{k+1, k} y\right\|_2 & \left.=\| a e_1-V_{k+1} R_{k+1, k} y\right) \|_2 \\ & =\left\|V_{k+1}\left(V_{k+1}^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k \\ 0 \end{array}\right] y\right)\right\|_2 \\ & =\left\|\left(V_{k+1}^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k y\\ 0 \end{array}\right] \right)\right\|_2 \\ & =\left\|\left[V_{k+1, k}, v_{k+1}\right]^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k y \\ 0 \end{array}\right]\right\|_2, \end{aligned} ∥ae1−Tk+1,ky∥2=∥ae1−Vk+1Rk+1,ky)∥2=∥∥∥∥Vk+1(Vk+1⊤(ae1)−[Rk0]y)∥∥∥∥2=∥∥∥∥(Vk+1⊤(ae1)−[Rky0])∥∥∥∥2=∥∥∥∥[Vk+1,k,vk+1]⊤(ae1)−[Rky0]∥∥∥∥2,
如果 R k R_k Rk非奇异,那么显然取 y y y满足 V k + 1 , k T a e 1 − R k y = 0 V_{k + 1,k}^{T} a e_1 - R_{k}y = 0 Vk+1,kTae1−Rky=0即可,此时有$y = a R_{k}^{-1} V_{k+1,k}^T e_1 , 因 此 根 据 M I N R E S 算 法 得 到 的 第 ,因此根据MINRES算法得到的第 ,因此根据MINRES算法得到的第k$次迭代向量如下:
x k = x 0 + a Q k R k − 1 V k + 1 , k T e 1 x^{k} = x^0 + a Q_k R_{k}^{-1} V_{k+1,k}^T e_1 xk=x0+aQkRk−1Vk+1,kTe1
首先介绍Lanczos分解,Lanzos把对称矩阵转换为一个三对角对称矩阵。考虑三对角对称矩阵如下,考虑正交分解
T = Q T A Q T = Q^T A Q T=QTAQ
T = ( α 1 β 1 0 ⋯ 0 0 β 1 α 2 β 2 0 ⋯ 0 0 β 2 α 3 β 3 ⋯ 0 0 ⋯ ⋯ ⋯ ⋯ 0 0 ⋯ 0 β n − 2 α n − 1 β n − 1 0 0 ⋯ 0 β n − 1 α n ) T=\left(\begin{array}{cccccc} \alpha_1 & \beta_1 & 0 & \cdots & 0 & 0\\ \beta_1 & \alpha_2 & \beta_2 & 0 & \cdots & 0 \\ 0 & \beta_2 & \alpha_3 & \beta_3 & \cdots & 0\\ 0 & \cdots & \cdots & \cdots & \cdots & 0\\ 0 & \cdots & 0 & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1}\\ 0 & 0 & \cdots & 0 & \beta_{n-1} & \alpha_n \end{array}\right) T=⎝⎜⎜⎜⎜⎜⎜⎛α1β10000β1α2β2⋯⋯00β2α3⋯0⋯⋯0β3⋯βn−200⋯⋯⋯αn−1βn−10000βn−1αn⎠⎟⎟⎟⎟⎟⎟⎞
下面重点考虑正交矩阵 Q Q Q和三对角矩阵 T T T的形成,我们记 Q = [ q 1 , q 2 , … , q n ] , q i ∈ R n Q=[q_1,q_2,\ldots,q_n],q_i \in R^n Q=[q1,q2,…,qn],qi∈Rn,则根据 Q T = A Q QT = AQ QT=AQ,我们会得到下面这个等式,约定 β 0 q 0 = β n q n = 0 \beta_0 q_0 = \beta_n q_n = 0 β0q0=βnqn=0:
A q i = β i − 1 q i − 1 + α i q i + β i q i + 1 , 1 ≤ i ≤ n . Aq_i = \beta_{i - 1}q_{i - 1} + \alpha_i q_i + \beta_i q_{i + 1},1 \leq i \leq n. Aqi=βi−1qi−1+αiqi+βiqi+1,1≤i≤n.
我们先考虑 α 1 , β 1 \alpha_1,\beta_1 α1,β1的确定,任意取一个向量 q 1 ∈ R n , ∥ q 1 ∥ 2 = 1 q_1 \in R^n, \| q_1 \|_2 = 1 q1∈Rn,∥q1∥2=1,则有
{ A q 1 = α 1 q 1 + β 1 q 2 , α 1 = q 1 T A q 1 , β 1 = ∥ A q 1 − α 1 q 1 ∥ 2 , q 2 = ( A q 1 − α 1 q 1 ) / β 1 . \left\{\begin{array}{l} Aq_1 = \alpha_1 q_1 + \beta_1 q_2,\\ \alpha_1 = q_1^T A q_1,\\ \beta_1 = \|Aq_1 - \alpha_1 q_1\|_2,\\ q_2 = (Aq_1 - \alpha_1 q_1)/\beta_1. \end{array}\right. ⎩⎪⎪⎨⎪⎪⎧Aq1=α1q1+β1q2,α1=q1TAq1,β1=∥Aq1−α1q1∥2,q2=(Aq1−α1q1)/β1.
类似地,假设我们已经得到了 [ q 1 , q 2 , … , q k ] [q_1,q_2,\ldots,q_k] [q1,q2,…,qk]和 [ α 1 , α 2 , … , α k − 1 ] , [ β 1 , β 2 , … , β k − 1 ] [\alpha_1,\alpha_2,\ldots,\alpha_{k - 1}],[\beta_1,\beta_2,\ldots,\beta_{k-1}] [α1,α2,…,αk−1],[β1,β2,…,βk−1],下面同样可以类似得到 α k , β k , q k + 1 \alpha_k,\beta_k,q_{k+1} αk,βk,qk+1,
{ A q k = β k − 1 q k − 1 + α k q k + β k q k + 1 , α k = q k T ( A q k − β k − 1 q k − 1 ) , β k = ∥ A q k − β k − 1 q k − 1 ∥ 2 , q k + 1 = ( A q k − β k − 1 q k − 1 ) / β k . \left\{\begin{array}{l} Aq_k = \beta_{k - 1}q_{k - 1} + \alpha_k q_k + \beta_k q_{k+1},\\ \alpha_k = q_k^T (Aq_k - \beta_{k - 1}q_{k - 1}),\\ \beta_k = \|Aq_k - \beta_{k - 1}q_{k - 1}\|_2,\\ q_{k+1} = (Aq_k - \beta_{k - 1}q_{k - 1})/\beta_k. \end{array}\right. ⎩⎪⎪⎨⎪⎪⎧Aqk=βk−1qk−1+αkqk+βkqk+1,αk=qkT(Aqk−βk−1qk−1),βk=∥Aqk−βk−1qk−1∥2,qk+1=(Aqk−βk−1qk−1)/βk.
引入记号 Q k = [ q 1 , q 2 , … , q k ] Q_k = [q_1,q_2,\ldots,q_k] Qk=[q1,q2,…,qk],和 T k T_k Tk,其中有
T k = ( α 1 β 1 0 ⋯ 0 0 β 1 α 2 β 2 0 ⋯ 0 0 β 2 α 3 β 3 ⋯ 0 0 ⋯ ⋯ ⋯ ⋯ 0 0 ⋯ 0 β k − 2 α k − 1 β k − 1 0 0 ⋯ 0 β k − 1 α k ) T_k=\left(\begin{array}{cccccc} \alpha_1 & \beta_1 & 0 & \cdots & 0 & 0\\ \beta_1 & \alpha_2 & \beta_2 & 0 & \cdots & 0 \\ 0 & \beta_2 & \alpha_3 & \beta_3 & \cdots & 0\\ 0 & \cdots & \cdots & \cdots & \cdots & 0\\ 0 & \cdots & 0 & \beta_{k-2} & \alpha_{k-1} & \beta_{k-1}\\ 0 & 0 & \cdots & 0 & \beta_{k-1} & \alpha_k\\ \end{array}\right) Tk=⎝⎜⎜⎜⎜⎜⎜⎛α1β10000β1α2β2⋯⋯00β2α3⋯0⋯⋯0β3⋯βk−200⋯⋯⋯αk−1βk−10000βk−1αk⎠⎟⎟⎟⎟⎟⎟⎞
则有 A Q k = Q k T k + β k q k + 1 e k T AQ_k = Q_k T_k + \beta_{k} q_{k + 1} e_{k}^T AQk=QkTk+βkqk+1ekT,其中 e k ∈ R k e_k \in \boldsymbol{R}^k ek∈Rk且最后一个元素为1,其余元素为0,如果 β k ≠ = 0 , 1 ≤ k ≤ n − 1 \beta_k \neq = 0,1 \leq k \leq n - 1 βk==0,1≤k≤n−1,则可以顺利得到对称矩阵A的三对角分解。如果求解过程中某个 β k 0 = 0 \beta_{k_0}= 0 βk0=0,那么显然可以根据 A Q k = Q k T k AQ_k = Q_k T_k AQk=QkTk得到前 k 0 − 1 k_0 - 1 k0−1个特征向量,如果要计算剩下的特征向量和特征值,只需要重新初始化一个 q 1 q_1 q1即可。
MINRES主要应用于对称不定方程求解,考虑线性方程组如下\eqref{line}所示,其中 A ∈ R n × n A \in \boldsymbol{R}^{n \times n} A∈Rn×n是对称矩阵。
A x = b Ax = b Ax=b
MINRES算法的出发点是寻求一个向量 x ( k ) ∈ x ( 0 ) + κ k ( A , r 0 ) x^{(k)} \in x^{(0)} + \kappa_k(A,r_0) x(k)∈x(0)+κk(A,r0),即
x k = min x ∈ x 0 + κ k ( A , r 0 ) ∥ A x − b ∥ 2 . x^{k} = \min_{x \in x^0 + \kappa_k(A,r_0)} \| Ax -b \|_2. xk=minx∈x0+κk(A,r0)∥Ax−b∥2.
其中 κ k ( A , r 0 ) = ( r 0 , A r 0 , … , A k − 1 r 0 ) \kappa_k(A,r_0) = (r_0,Ar_0,\ldots,A^{k-1}r_0) κk(A,r0)=(r0,Ar0,…,Ak−1r0)形成的向量空间。
选择 q 1 = r 0 ∥ r 0 ∥ , r 0 = b − A x 0 q_1 = \frac{r^0}{\|r^0\|},r^0 = b - Ax^0 q1=∥r0∥r0,r0=b−Ax0,则 ( r 0 , A r 0 , … , A k − 1 r 0 ) = ( q 1 , A q 1 , … , A k − 1 q 1 ) ∥ r 0 ∥ 2 (r^0,Ar^0,\ldots,A^{k-1}r^0)=(q_1,Aq_1,\ldots,A^{k-1}q_1)\|r^0\|_2 (r0,Ar0,…,Ak−1r0)=(q1,Aq1,…,Ak−1q1)∥r0∥2,再根据 A q k = β k − 1 q k − 1 + α k q k + β k q k + 1 Aq_k = \beta_{k - 1}q_{k - 1} + \alpha_k q_k + \beta_k q_{k+1} Aqk=βk−1qk−1+αkqk+βkqk+1可得 q k + 1 ∈ κ k ( q 1 ) = κ k ( r 0 ) q_{k+1} \in \kappa_{k}(q_1) = \kappa_{k} (r^0) qk+1∈κk(q1)=κk(r0),由此可以得到对于任意一个向量 x x x属于 Q k Q_k Qk张成的向量空间,则必有 x ∈ κ k ( r 0 ) x \in \kappa_{k}(r^0) x∈κk(r0),反之亦然。
引入 Q k + 1 = [ Q k , q k + 1 ] Q_{k+1} = [Q_k,q_{k+1}] Qk+1=[Qk,qk+1],则根据上面的推导可以得到 A Q k = Q k + 1 T k + 1 , k AQ_k = Q_{k+1}T_{k+1,k} AQk=Qk+1Tk+1,k,
T k + 1 , k = ( T k β k e k T ) T_{k+1,k}=\left(\begin{array}{c} T_k \\ \beta_k e_k^T \end{array}\right) Tk+1,k=(TkβkekT)
于是乎可以把原始问题转换为下面这个形式,特别注意,这里的 e 1 ∈ R k + 1 e_1 \in \boldsymbol{R}^{k+1} e1∈Rk+1,上面的 e k ∈ R k e_k \in \boldsymbol{R}^k ek∈Rk:
{ x k = min x ∈ x 0 + κ k ( A , r 0 ) ∥ A x − b ∥ 2 = min y ∈ R k ∥ A ( x 0 + Q k y ) − b ∥ 2 = min y ∈ R k ∥ r 0 − A Q k y ∥ 2 = min y ∈ R k ∥ r 0 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 q 1 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 Q k + 1 e 1 − Q k + 1 T k + 1 , k y ∥ 2 = min y ∈ R k ∥ Q k + 1 ( ∥ r 0 ∥ 2 e 1 − T k + 1 , k y ) ∥ 2 = min y ∈ R k ∥ ∥ r 0 ∥ 2 e 1 − T k + 1 , k y ∥ 2 . \left\{\begin{aligned} x^{k} &= \min_{x \in x^0 + \kappa_k(A,r_0)} \| Ax -b \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| A(x^0 + Q_k y) -b \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| r^0 - AQ_k y \|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| r^0 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 q_1 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 Q_{k+1}e_1 - Q_{k+1}T_{k+1,k} y\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| Q_{k + 1}(\|r^0\|_2 e_1 - T_{k+1,k} y)\|_2 \\ & = \min_{y \in \boldsymbol{R}^{k}} \| \|r^0\|_2 e_1 - T_{k+1,k} y\|_2. \end{aligned}\right. ⎩⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎧xk=x∈x0+κk(A,r0)min∥Ax−b∥2=y∈Rkmin∥A(x0+Qky)−b∥2=y∈Rkmin∥r0−AQky∥2=y∈Rkmin∥r0−Qk+1Tk+1,ky∥2=y∈Rkmin∥∥r0∥2q1−Qk+1Tk+1,ky∥2=y∈Rkmin∥∥r0∥2Qk+1e1−Qk+1Tk+1,ky∥2=y∈Rkmin∥Qk+1(∥r0∥2e1−Tk+1,ky)∥2=y∈Rkmin∥∥r0∥2e1−Tk+1,ky∥2.
引入记号 a = ∥ r 0 ∥ 2 a = \|r^0\|_2 a=∥r0∥2,原始问题进一步转化为求下面这个问题,对于这个问题,我们引入QR分解来处理。
min y ∈ R k ∥ a e 1 − T k + 1 , k y ∥ 2 . \min_{y \in \boldsymbol{R}^k} \| a e_1 - T_{k + 1,k} y \|_2. miny∈Rk∥ae1−Tk+1,ky∥2.
考虑矩阵 T k + 1 , k T_{k+1,k} Tk+1,k的QR分解,为了区分上面对称矩阵 A A A的正交分解,这里我们假设 T k + 1 , k = V k + 1 R k + 1 , k T_{k+1,k} = V_{k+1}R_{k+1,k} Tk+1,k=Vk+1Rk+1,k,其中 V k + 1 ∈ R ( k + 1 ) × ( k + 1 ) V_{k+1} \in R^{(k+1) \times (k+1)} Vk+1∈R(k+1)×(k+1)是正交矩阵,
R k + 1 , k = ( r 0 , 0 ⋯ ⋯ 0 r 1 , 1 ⋯ 0 0 ⋯ ⋯ 0 0 ⋯ ⋯ r k − 1 , k − 1 0 0 ⋯ 0 0 0 ) = ( R k 0 T ) R_{k+1,k}=\left(\begin{array}{cccccc} r_{0,0} & & & \cdots & \cdots & \\ 0 & r_{1,1} & & & \cdots & \\ 0 & 0 & \cdots & \cdots & & \\ 0 & 0 & \cdots & \cdots & &r_{k-1,k-1} \\ 0 & 0 & \cdots & 0 & 0 & 0\\ \end{array}\right) =\left(\begin{array}{c} R_k \\ 0^T \\ \end{array}\right) Rk+1,k=⎝⎜⎜⎜⎜⎛r0,00000r1,1000⋯⋯⋯⋯⋯⋯0⋯⋯0rk−1,k−10⎠⎟⎟⎟⎟⎞=(Rk0T)
有了上面的介绍以后,我们直接带入问题就可以得到
∥ a e 1 − T k + 1 , k y ∥ 2 = ∥ a e 1 − V k + 1 R k + 1 , k y ) ∥ 2 = ∥ V k + 1 ( V k + 1 ⊤ ( a e 1 ) − [ R k 0 ] y ) ∥ 2 = ∥ ( V k + 1 ⊤ ( a e 1 ) − [ R k y 0 ] ) ∥ 2 = ∥ [ V k + 1 , k , v k + 1 ] ⊤ ( a e 1 ) − [ R k y 0 ] ∥ 2 , \begin{aligned} \left\|a e_1-T_{k+1, k} y\right\|_2 & \left.=\| a e_1-V_{k+1} R_{k+1, k} y\right) \|_2 \\ & =\left\|V_{k+1}\left(V_{k+1}^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k \\ 0 \end{array}\right] y\right)\right\|_2 \\ & =\left\|\left(V_{k+1}^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k y\\ 0 \end{array}\right] \right)\right\|_2 \\ & =\left\|\left[V_{k+1, k}, v_{k+1}\right]^{\top}\left(a e_1\right)-\left[\begin{array}{c} R_k y \\ 0 \end{array}\right]\right\|_2, \end{aligned} ∥ae1−Tk+1,ky∥2=∥ae1−Vk+1Rk+1,ky)∥2=∥∥∥∥Vk+1(Vk+1⊤(ae1)−[Rk0]y)∥∥∥∥2=∥∥∥∥(Vk+1⊤(ae1)−[Rky0])∥∥∥∥2=∥∥∥∥[Vk+1,k,vk+1]⊤(ae1)−[Rky0]∥∥∥∥2,
如果 R k R_k Rk非奇异,那么显然取 y y y满足 V k + 1 , k T a e 1 − R k y = 0 V_{k + 1,k}^{T} a e_1 - R_{k}y = 0 Vk+1,kTae1−Rky=0即可,此时有$y = a R_{k}^{-1} V_{k+1,k}^T e_1 , 因 此 根 据 M I N R E S 算 法 得 到 的 第 ,因此根据MINRES算法得到的第 ,因此根据MINRES算法得到的第k$次迭代向量如下:
x k = x 0 + a Q k R k − 1 V k + 1 , k T e 1 x^{k} = x^0 + a Q_k R_{k}^{-1} V_{k+1,k}^T e_1 xk=x0+aQkRk−1Vk+1,kTe1
我们希望可以在 T k , k − 1 T_{k,k-1} Tk,k−1的QR分解基础上做一次Givens变化得到 T k + 1 , k T_{k+1,k} Tk+1,k的QR分解,假设 T k , k − 1 T_{k,k-1} Tk,k−1的QR分解如下:
T k , k − 1 = ( G k − 1 G k − 2 … G 1 ) T R k , k − 1 = V k R k , k − 1 T_{k,k-1} = (G_{k-1} G_{k-2} \ldots G_1)^T R_{k,k-1} = V_{k} R_{k,k-1} Tk,k−1=(Gk−1Gk−2…G1)TRk,k−1=VkRk,k−1
其中
R k , k − 1 = ( r 0 , 0 ⋯ ⋯ 0 r 1 , 1 ⋯ 0 0 ⋯ ⋯ 0 0 ⋯ ⋯ r k − 1 , k − 1 0 0 ⋯ 0 0 0 ) = ( R k − 1 0 T ) R_{k,k - 1}= \left(\begin{array}{cccccc} r_{0,0} & & & \cdots & \cdots & \\ 0 & r_{1,1} & & & \cdots & \\ 0 & 0 & \cdots & \cdots & & \\ 0 & 0 & \cdots & \cdots & &r_{k-1,k-1} \\ 0 & 0 & \cdots & 0 & 0 & 0\\ \end{array}\right) =\left(\begin{array}{c} R_{k - 1} \\ 0^T \\ \end{array}\right) Rk,k−1=⎝⎜⎜⎜⎜⎛r0,00000r1,1000⋯⋯⋯⋯⋯⋯0⋯⋯0rk−1,k−10⎠⎟⎟⎟⎟⎞=(Rk−10T)
以及Givens变换矩阵形如下面这个形式:
G = ( 1 0 0 ⋯ ⋯ 0 0 0 1 0 0 ⋯ 0 0 0 0 ⋯ ⋯ 0 0 0 0 0 ⋯ c ⋯ s 0 0 0 ⋯ ⋯ ⋯ 0 0 0 0 ⋯ − s ⋯ c 0 0 0 ⋯ 0 ⋯ 0 1 ) G =\left(\begin{array}{ccccccc} 1 & 0 & 0 & \cdots & \cdots & 0 & 0\\ 0 & 1 & 0 & 0 & \cdots & 0 & 0\\ 0 & 0 & \cdots & \cdots & 0 & 0 & 0\\ 0 & 0 & \cdots & c & \cdots & s & 0\\ 0 & 0 & \cdots & \cdots & \cdots & 0 & 0\\ 0 & 0 & \cdots & -s & \cdots & c & 0\\ 0 & 0 & \cdots & 0 & \cdots & 0 & 1\\ \end{array}\right) G=⎝⎜⎜⎜⎜⎜⎜⎜⎜⎛1000000010000000⋯⋯⋯⋯⋯⋯0⋯c⋯−s0⋯⋯0⋯⋯⋯⋯000s0c00000001⎠⎟⎟⎟⎟⎟⎟⎟⎟⎞
针对这个问题使用的Givens变换 G i G_i Gi仅仅改变矩阵 i , i + 1 i,i+1 i,i+1的元素,因此 G i G_i Gi的具体形式如下:
G i = [ I i − 1 c i s i − s i c i I k − i − 1 ] G_i=\left[\begin{array}{cccc} I_{i-1} & & & \\ & c_i & s_i & \\ & -s_i & c_i & \\ & & & I_{k-i-1} \end{array}\right] Gi=⎣⎢⎢⎡Ii−1ci−sisiciIk−i−1⎦⎥⎥⎤
我们重新回忆一下 T k + 1 , k T_{k+1,k} Tk+1,k的形状,
{ T k + 1 , k = ( T k β k e k T ) = [ T k , k − 1 0 ⋮ β k − 1 α k 0 T β k ] = [ V k R k , k − 1 0 ⋮ β k − 1 α k 0 T β k ] = [ V k 0 0 1 ] [ R k , k − 1 V k − 1 [ 0 ⋮ 0 β k − 1 α k ] 0 β k ] = [ V k 0 0 1 ] T ~ k + 1 , k \left\{\begin{aligned} T_{k+1,k} &=\left(\begin{array}{c} T_k \\ \beta_k e_k^T \end{array}\right) =\left[\begin{array}{c|c} T_{k, k-1} & 0 \\ & \vdots \\ & \beta_{k-1} \\ & \alpha_k \\ \hline 0^T & \beta_k \end{array}\right] =\left[\begin{array}{c|c} V_{k}R_{k,k-1} & 0 \\ & \vdots \\ & \beta_{k-1} \\ & \alpha_k \\ \hline 0^T & \beta_k \end{array}\right] \\ & =\left[\begin{array}{cc} V_k & 0 \\ 0 & 1 \end{array}\right]\left[\begin{array}{c|c} R_{k, k-1} & V_k^{-1}\left[\begin{array}{c} 0 \\ \vdots \\ 0 \\ \beta_{k-1} \\ \alpha_k \end{array}\right] \\ \hline 0 & \beta_k \end{array}\right] \\ & = \left[\begin{array}{cc} V_k & 0 \\ 0 & 1 \end{array}\right] \tilde{T}_{k+1, k} \end{aligned}\right. ⎩⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎧Tk+1,k=(TkβkekT)=⎣⎢⎢⎢⎢⎢⎡Tk,k−10T0⋮βk−1αkβk⎦⎥⎥⎥⎥⎥⎤=⎣⎢⎢⎢⎢⎢⎡VkRk,k−10T0⋮βk−1αkβk⎦⎥⎥⎥⎥⎥⎤=[Vk001]⎣⎢⎢⎢⎢⎢⎢⎢⎡Rk,k−10Vk−1⎣⎢⎢⎢⎢⎢⎡0⋮0βk−1αk⎦⎥⎥⎥⎥⎥⎤βk⎦⎥⎥⎥⎥⎥⎥⎥⎤=[Vk001]T~k+1,k
而且另一方面:
Q k − 1 [ 0 ⋮ 0 0 β k − 1 α k ] = G k − 1 G k − 2 ⋯ G 1 [ 0 ⋮ 0 β k − 1 α k ] = G k − 1 G k − 2 [ 0 ⋮ 0 0 β k − 1 α k ] = [ 0 ⋮ 0 r k − 3 , k − 1 r k − 2 , k − 1 r k − 1 , k − 1 ] Q_k^{-1}\left[\begin{array}{c} 0 \\ \vdots \\ 0 \\ 0 \\ \beta_{k-1} \\ \alpha_k \end{array}\right]=G_{k-1} G_{k-2} \cdots G_1\left[\begin{array}{c} 0 \\ \vdots \\ 0 \\ \beta_{k-1} \\ \alpha_k \end{array}\right]=G_{k-1} G_{k-2}\left[\begin{array}{c} 0 \\ \vdots \\ 0 \\ 0 \\ \beta_{k-1} \\ \alpha_k \end{array}\right]=\left[\begin{array}{c} 0 \\ \vdots \\ 0 \\ r_{k-3,k-1} \\ r_{k-2,k-1} \\ r_{k-1,k-1} \end{array}\right] Qk−1⎣⎢⎢⎢⎢⎢⎢⎢⎡0⋮00βk−1αk⎦⎥⎥⎥⎥⎥⎥⎥⎤=Gk−1Gk−2⋯G1⎣⎢⎢⎢⎢⎢⎡0⋮0βk−1αk⎦⎥⎥⎥⎥⎥⎤=Gk−1Gk−2⎣⎢⎢⎢⎢⎢⎢⎢⎡0⋮00βk−1αk⎦⎥⎥⎥⎥⎥⎥⎥⎤=⎣⎢⎢⎢⎢⎢⎢⎢⎡0⋮0rk−3,k−1rk−2,k−1rk−1,k−1⎦⎥⎥⎥⎥⎥⎥⎥⎤
有了上述的介绍以后,我们发现只要选择一个Givens变换把 T ~ k + 1 , k \tilde{T}_{k+1, k} T~k+1,k右下角的 β k \beta_k βk消去即可,因此选择 G k G_k Gk如下:
G k = [ I k − 1 c k s k − s k c k ] ∈ R ( k + 1 ) × ( k + 1 ) G_k=\left[\begin{array}{ccc} I_{k-1} & & \\ & c_k & s_k \\ & -s_k & c_k \end{array}\right] \in \boldsymbol{R}^{(k+1) \times(k+1)} Gk=⎣⎡Ik−1ck−skskck⎦⎤∈R(k+1)×(k+1)
于是我们可以得到 T k + 1 , k T_{k+1,k} Tk+1,k的QR分解 T k + 1 , k = V k + 1 R k + 1 , k T_{k+1,k} = V_{k+1}R_{k+1,k} Tk+1,k=Vk+1Rk+1,k,这里假设Givens变换是矩阵 G G G的维度自动增长。
V k + 1 = [ V k 0 0 1 ] G k ⊤ = ( G k G ~ k − 1 ⋯ G ~ 1 ) ⊤ V_{k+1}=\left[\begin{array}{cc} V_k & 0 \\ 0 & 1 \end{array}\right] G_k^{\top}=\left(G_k \tilde{G}_{k-1} \cdots \tilde{G}_1\right)^{\top} Vk+1=[Vk001]Gk⊤=(GkG~k−1⋯G~1)⊤
注意,由于矩阵 T k + 1 , k T_{k+1,k} Tk+1,k是三对角矩阵,而Givens变换只改变两行两列元素,因此 R k + 1 , k R_{k+1,k} Rk+1,k是一个上三角的带宽为3的带状矩阵,以及观察这个变换过程可以发现 R k − 1 R_{k-1} Rk−1其实是 R k R_k Rk的 ( k − 1 ) (k-1) (k−1)阶顺序主子阵。
整理上文,我们有
{ A = Q T T Q , r 0 = b − A x 0 = ∥ r 0 ∥ 2 q 1 = a q 1 , Q k = [ q 1 , q 2 , … , q k ] , x k = x 0 + Q k y , min y ∈ R k ∥ b − A x 0 − A Q k y ∥ = min y ∈ R k ∥ r 0 − A Q k y ∥ , min y ∈ R k ∥ r 0 − A Q k y ∥ = min y ∈ R k ∥ a Q k e 1 − A Q k y ∥ = min y ∈ R k ∥ a Q k e 1 − Q k T k + 1 , k y ∥ , min y ∈ R k ∥ a e 1 − T k + 1 , k y ∥ 2 = min y ∈ R k ∥ a e 1 − V k + 1 R k + 1 , k y ∥ 2 , y = a R k − 1 V k + 1 , k T e 1 , T k + 1 , k = ( T k β k e k T ) , V