Chapter6 线性方程组的迭代解法
6.1 范数
6.1.1 向量范数
- 定义:对任意n维向量 x \boldsymbol{x} x,有对应的一个非负实数 ∣ ∣ x ∣ ∣ ||\boldsymbol{x}|| ∣∣x∣∣满足->
- 非负性: ∣ ∣ x ∣ ∣ ⩾ 0 ||x||\geqslant 0 ∣∣x∣∣⩾0,并且 ∣ ∣ x ∣ ∣ = 0 → x = 0 ||x||=0 \rightarrow x=0 ∣∣x∣∣=0→x=0
- 齐次性:对于任意实数 γ \gamma γ,有 ∣ ∣ γ x ∣ ∣ = γ ∣ ∣ x ∣ ∣ ||\gamma x|| = \gamma ||x|| ∣∣γx∣∣=γ∣∣x∣∣
- 三角不等性: ∣ ∣ x ∣ ∣ + ∣ ∣ y ∣ ∣ ⩾ ∣ ∣ x + y ∣ ∣ ||x||+||y|| \geqslant ||x+y|| ∣∣x∣∣+∣∣y∣∣⩾∣∣x+y∣∣
- 分类
- 1-norms:
norm(x,1)
∣ ∣ v ∣ ∣ = ∣ v 1 ∣ + ∣ v 2 ∣ + ∣ v 3 ∣ + ⋯ + ∣ v n ∣ = ∑ i = 1 n ∣ v i ∣ ||v||=|v_1|+|v_2|+|v_3|+\dots+|v_n|=\sum_{i=1}^{n}|v_i| ∣∣v∣∣=∣v1∣+∣v2∣+∣v3∣+⋯+∣vn∣=i=1∑n∣vi∣
- 2-norms(欧几里得范数):
norm(x,2)
∥ x ∥ 2 = v 1 2 + v 2 2 + ⋯ + v n 2 = ∑ i = 1 n v i 2 \|x\|_2=\sqrt{v_1^2+v_2^2+\cdots+v_n^2}=\sqrt{\sum_{i=1}^n v_i^2} ∥x∥2=v12+v22+⋯+vn2 =i=1∑nvi2
- ∞ \infty ∞-norms:
norm(x,inf)
∥ x ∥ ∞ = max 1 ⩽ j ⩽ n ∣ x j ∣ \|x\|_{\infty}=\max _{1 \leqslant j \leqslant n}\left|x_j\right| ∥x∥∞=1⩽j⩽nmax∣xj∣
6.1.2 矩阵范数
- 定义:对任意n阶方阵 A \boldsymbol{A} A,有对应的一个非负实数 ∣ ∣ A ∣ ∣ ||\boldsymbol{A}|| ∣∣A∣∣满足->
- 非负性: ∣ ∣ A ∣ ∣ ⩾ 0 ||\boldsymbol{A}||\geqslant 0 ∣∣A∣∣⩾0,并且 ∣ ∣ A ∣ ∣ = 0 → A = 0 ||\boldsymbol{A}||=0 \rightarrow \boldsymbol{A}=\boldsymbol{0} ∣∣A∣∣=0→A=0
- 齐次性:对于任意实数 γ \gamma γ,有 ∣ ∣ γ A ∣ ∣ = γ ∣ ∣ A ∣ ∣ ||\gamma \boldsymbol{A}|| = \gamma ||\boldsymbol{A}|| ∣∣γA∣∣=γ∣∣A∣∣
- 三角不等性: ∣ ∣ A ∣ ∣ + ∣ ∣ B ∣ ∣ ⩾ ∣ ∣ A + B ∣ ∣ ||\boldsymbol{A}||+||\boldsymbol{B}|| \geqslant ||\boldsymbol{A}+\boldsymbol{B}|| ∣∣A∣∣+∣∣B∣∣⩾∣∣A+B∣∣
- 相容性: ∣ ∣ A B ∣ ∣ ≤ ∣ ∣ A ∣ ∣ ⋅ ∣ ∣ B ∣ ∣ ||\boldsymbol{AB}||\leq||\boldsymbol{A}||\cdot||\boldsymbol{B}|| ∣∣AB∣∣≤∣∣A∣∣⋅∣∣B∣∣
- 谱半径:矩阵特征值的最大值
ρ ( A ) = max 1 ≤ j ≤ n ∣ λ i ∣ \rho(\bold{A})=\max _{1 \leq j \leq n}|\lambda_{i}| ρ(A)=1≤j≤nmax∣λi∣
- 分类
- 1-norms(列模):
norm(A,1)
∥ A ∥ 1 = max 1 ≤ j ≤ n ∑ i = 1 n ∣ a i j ∣ \|A\|_1=\max _{1 \leq j \leq n} \sum_{i=1}^n\left|a_{i j}\right| ∥A∥1=1≤j≤nmaxi=1∑n∣aij∣
- 2-norms(谱模):
norm(A,2)
∥ A ∥ 2 = ρ ( A T A ) \lVert A\rVert_2 =\sqrt{\rho(A^TA)} ∥A∥2=ρ(ATA)
- ∞ \infty ∞-norms(行模):
norm(A,inf)
∥ A ∥ 1 = max 1 ≤ i ≤ n ∑ j = 1 n ∣ a i j ∣ \|A\|_1=\max _{1 \leq i \leq n} \sum_{j=1}^n\left|a_{i j}\right| ∥A∥1=1≤i≤nmaxj=1∑n∣aij∣
- F-norms(vector 2-norms):
norm(A,'fro')
∥ A ∥ F = ( ∑ i = 1 m ∑ j = 1 n ∣ a i j ∣ 2 ) 1 2 \|\mathrm{A}\|_{\mathrm{F}}=\left(\sum_{\mathrm{i}=1}^{\mathrm{m}} \sum_{\mathrm{j}=1}^{\mathrm{n}}\left|\mathrm{a}_{\mathrm{ij}}\right|^2\right)^{\frac{1}{2}} ∥A∥F= i=1∑mj=1∑n∣aij∣2 21
6.1.3 相容关系
- 定义:若如下不等式成立,我们称向量范数和矩阵范数相容
∣ ∣ A x ∣ ∣ ≤ ∣ ∣ A ∣ ∣ ⋅ ∣ ∣ x ∣ ∣ ||\boldsymbol{Ax}||\leq||\boldsymbol{A}||\cdot||\boldsymbol{x}|| ∣∣Ax∣∣≤∣∣A∣∣⋅∣∣x∣∣
- 其余范数应满足:
∣ ∣ A x ∣ ∣ 1 ≤ ∣ ∣ A ∣ ∣ 1 ⋅ ∣ ∣ x ∣ ∣ 1 ∣ ∣ A x ∣ ∣ 2 ≤ ∣ ∣ A ∣ ∣ 2 ⋅ ∣ ∣ x ∣ ∣ 2 ∣ ∣ A x ∣ ∣ ∞ ≤ ∣ ∣ A ∣ ∣ ∞ ⋅ ∣ ∣ x ∣ ∣ ∞ ∣ ∣ A x ∣ ∣ 2 ≤ ∣ ∣ A ∣ ∣ F ⋅ ∣ ∣ x ∣ ∣ 2 ||\boldsymbol{Ax}||_{1}\leq||\boldsymbol{A}||_{1}\cdot||\boldsymbol{x}||_{1}\\ ||\boldsymbol{Ax}||_{2}\leq||\boldsymbol{A}||_{2}\cdot||\boldsymbol{x}||_{2}\\ ||\boldsymbol{Ax}||_{\infty}\leq||\boldsymbol{A}||_{\infty}\cdot||\boldsymbol{x}||_{\infty}\\ ||\boldsymbol{Ax}||_{2}\leq||\boldsymbol{A}||_{F}\cdot||\boldsymbol{x}||_{2}\\ ∣∣Ax∣∣1≤∣∣A∣∣1⋅∣∣x∣∣1∣∣Ax∣∣2≤∣∣A∣∣2⋅∣∣x∣∣2∣∣Ax∣∣∞≤∣∣A∣∣∞⋅∣∣x∣∣∞∣∣Ax∣∣2≤∣∣A∣∣F⋅∣∣x∣∣2
6.2 条件数
6.2.1 条件数的推导
对一般的非奇异线性方程组 A x = b \boldsymbol{Ax=b} Ax=b,在 x \boldsymbol{x} x和 b \boldsymbol{b} b两端分别增加扰动 δ b \boldsymbol{\delta b} δb和 δ x \boldsymbol{\delta x} δx,有:
A ( x + δ x ) = b + δ b A(x+\delta x) = b + \delta b A(x+δx)=b+δb
带入 A x = b \boldsymbol{Ax=b} Ax=b,得到:
δ x = A − 1 δ b \delta x = A^{-1}\delta b δx=A−1δb
两边同时取范数,根据相容性得:
∥ δ x ∥ ≤ ∥ A − 1 ∥ ⋅ ∥ δ b ∥ \left\lVert \delta x \right\lVert \leq \left\lVert A^{-1} \right\lVert\cdot \left\lVert \delta b \right\lVert ∥δx∥≤ A−1 ⋅∥δb∥
对 A x = b \boldsymbol{Ax=b} Ax=b两边取范数,得到:
∥ b ∥ ≤ ∥ A ∥ ⋅ ∥ x ∥ \left\lVert b \right\lVert \leq \left\lVert A \right\lVert\cdot \left\lVert x \right\lVert ∥b∥≤∥A∥⋅∥x∥
1 ∥ x ∥ ≤ ∥ A ∥ ⋅ 1 ∥ b ∥ \frac{1}{\left\lVert x \right\lVert} \leq \left\lVert A \right\lVert\cdot \frac{1}{\left\lVert b \right\lVert} ∥x∥1≤∥A∥⋅∥b∥1
两边分别相乘:
∥ δ x ∥ ∥ x ∥ ≤ ∥ A ∥ ∥ A − 1 ∥ ∥ δ b ∥ ∥ b ∥ \frac{\left\lVert \delta x \right\lVert}{\left\lVert x \right\lVert} \leq \left\lVert A \right\lVert\left\lVert A^{-1} \right\lVert \frac{\left\lVert \delta b \right\lVert}{\left\lVert b \right\lVert} ∥x∥∥δx∥≤∥A∥ A−1 ∥b∥∥δb∥
式中这个 ∥ A ∥ ∥ A − 1 ∥ \left\lVert A \right\lVert\left\lVert A^{-1} \right\lVert ∥A∥ A−1 玩意起到一个放大倍数的作用,对方程组得解的相对误差起到关键性作用。
6.2.2 定义
若 A \boldsymbol{A} A为n阶非奇异矩阵,称 c o n d ( A ) = ∣ ∣ A ∣ ∣ ⋅ ∣ ∣ A − 1 ∣ ∣ cond(\boldsymbol{A})=||\boldsymbol{A}||\cdot||\boldsymbol{A^{-1}}|| cond(A)=∣∣A∣∣⋅∣∣A−1∣∣为条件数
6.2.3 分类
常用的有:
cond(A,inf)
:
c o n d ∞ ( A ) = ∣ ∣ A ∣ ∣ ∞ ⋅ ∣ ∣ A − 1 ∣ ∣ ∞ cond_{\infty}(A)=||A||_{\infty}\cdot||A^{-1}||_{\infty} cond∞(A)=∣∣A∣∣∞⋅∣∣A−1∣∣∞
cond(A,2)
:
c o n d 2 ( A ) = ∣ ∣ A ∣ ∣ 2 ⋅ ∣ ∣ A − 1 ∣ ∣ 2 cond_{2}(A)=||A||_{2}\cdot||A^{-1}||_{2} cond2(A)=∣∣A∣∣2⋅∣∣A−1∣∣2
- 如果A是对称正定矩阵,有:
c o n d 2 ( A ) = λ 1 ( A ) λ n ( A ) = A 特征值最大值 A 特征值最小值 cond_{2}(A)=\frac{\lambda_{1}(A)}{\lambda_{n}(A)}=\frac{A特征值最大值}{A特征值最小值} cond2(A)=λn(A)λ1(A)=A特征值最小值A特征值最大值
证明:
A x = λ x = > A A x = A λ x = λ A x = λ 2 x Ax=\lambda x => AAx=A\lambda x=\lambda Ax=\lambda^{2}x Ax=λx=>AAx=Aλx=λAx=λ2x
A ⋅ A T = A 2 = > i f λ 1 2 > . . . > λ n 2 A\cdot A^{T}=A^{2} => if\;\lambda_{1}^{2}>...>\lambda_{n}^{2} A⋅AT=A2=>ifλ12>...>λn2
∣ ∣ A ⋅ A T ∣ ∣ 2 = ∥ A ∥ 2 = ρ ( A T A ) = λ 1 ∣ ∣ A − 1 ⋅ ( A T ) − 1 ∣ ∣ 2 = ∥ A ∥ 2 = ρ ( A T A ) = 1 λ n ||A\cdot A^{T}||_{2}=\lVert A\rVert_2 =\sqrt{\rho(A^TA)}=\lambda_{1}\\ ||A^{-1}\cdot (A^{T})^{-1}||_{2}=\lVert A \rVert_2 =\sqrt{\rho(A^TA)}=\frac{1}{\lambda_{n}} ∣∣A⋅AT∣∣2=∥A∥2=ρ(ATA) =λ1∣∣A−1⋅(AT)−1∣∣2=∥A∥2=ρ(ATA) =λn1
6.2.4 性质
- 对于任意的n阶非奇异矩阵A, c o n d ( A ) ≥ 1 cond(A)\geq1 cond(A)≥1
∣ ∣ A ∣ ∣ ⋅ ∣ ∣ A − 1 ∣ ∣ ≥ ∣ ∣ A ⋅ A − 1 ∣ ∣ = ∣ ∣ I ∣ ∣ = 1 ||A||\cdot||A^{-1}|| \geq ||A\cdot A^{-1}||= ||I||=1 ∣∣A∣∣⋅∣∣A−1∣∣≥∣∣A⋅A−1∣∣=∣∣I∣∣=1
- 对于任意的n阶非奇异矩阵A和常数c, c o n d ( c A ) = c o n d ( A ) cond(cA) =cond(A) cond(cA)=cond(A)
- 对于任意正交矩阵P, c o n d 2 ( P ) = 1 cond_{2}(P)=1 cond2(P)=1,且 c o n d 2 ( A P ) = c o n d 2 ( P A ) = c o n d 2 ( A ) cond_{2}(AP)=cond_{2}(PA)=cond_{2}(A) cond2(AP)=cond2(PA)=cond2(A)
选用常数和正交矩阵,无法改善条件数
6.2.5 作用
条件数越小,表示这个矩阵式良态矩阵,反之,为病态矩阵。
6.3 基本迭代法
定常迭代法的迭代矩阵通常保持不变
6.3.1 basic concept
- 本质:find a matrix, derive the origin matrix to a new form
o r i g i n : A x = b ↓ A = M − n M x = N x + b , M = 非奇异 x = B x + g = { B = M − 1 N g = M − 1 b ↓ n e w f o r m : x ( k + 1 ) = B ⋅ x ( k ) + g origin:Ax=b\\ \downarrow\\ A=M-n\\ Mx=Nx+b,M=非奇异\\ x=Bx+g=\begin{cases} B & = M^{-1}N \\ g&=M^{-1}b \end{cases}\\ \downarrow\\ new form:x^{(k+1)}=B\cdot x^{(k)}+g origin:Ax=b↓A=M−nMx=Nx+b,M=非奇异x=Bx+g={Bg=M−1N=M−1b↓newform:x(k+1)=B⋅x(k)+g
- 定义:我们希望 M M M和 N N N具有特殊性质,因此,定义对角矩阵 D D D、上三角矩阵 L L L和下三角矩阵 U U U:
A = D − L − U D = d i a g ( a 11 , a 22 , . . . , a n n ) L = − ( 0 a 21 0 ⋮ ⋱ a n 1 a n 2 ⋯ 0 ) U = − ( 0 a 12 ⋯ a 1 n 0 ⋯ a 2 n ⋱ ⋮ 0 ) A=D-L-U\\ D=diag(a_{11},a_{22},...,a_{nn})\\ L=-\begin{pmatrix} 0\\ a_{21} &0\\ \vdots & &\ddots\\ a_{n1} & a_{n2}&\cdots &0 \end{pmatrix}\\ U=-\begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ &0 & \cdots &a_{2n} \\ & & \ddots & \vdots \\ & & &0 \end{pmatrix} A=D−L−UD=diag(a11,a22,...,ann)L=− 0a21⋮an10an2⋱⋯0 U=− 0a120⋯⋯⋱a1na2n⋮0
6.3.2 Jacobi
- 本质:
x ( k + 1 ) = D − 1 ( L + U ) x ( k ) + D − 1 b x^{(k+1)}=D^{-1}(L+U)x^{(k)}+D^{-1}b x(k+1)=D−1(L+U)x(k)+D−1b
- 推导:
{ a 11 x 1 + a 12 x 2 + ⋯ a 1 n x n = b 1 a 21 x 1 + a 22 x 2 + ⋯ a 2 n x n = b 2 ⋯ a n 1 x 1 + a n 2 x 2 + ⋯ a n n x n = b n ⇔ x = M x + g \left\{\begin{array}{ll} a_{11} x_1+a_{12} x_2+\cdots a_{1 n} x_n=b_1 \\ a_{21} x_1+a_{22} x_2+\cdots a_{2 n} x_n=b_2 \\ \cdots \\ a_{n 1} x_1+a_{n 2} x_2+\cdots a_{n n} x_n=b_n \end{array} \quad \Leftrightarrow x=M x+g\right.\\ ⎩ ⎨ ⎧a11x1+a12x2+⋯a1nxn=b1a21x1+a22x2+⋯a2nxn=b2⋯an1x1+an2x2+⋯annxn=bn⇔x=Mx+g
⇓ { x 1 = − a 12 a 11 x 2 − ⋯ − a 1 n a 11 x n + b 1 a 11 x 2 = − a 21 a 22 x 1 − ⋯ − a 2 n a 22 x n + b 2 a 22 ⋯ x n = − a n 1 a n n x 1 − a n 2 a n n x 2 − ⋯ − a n n − 1 a n n x n − 1 + b n a n n \Downarrow\\ \left\{\begin{array}{l} x_1=-\frac{a_{12}}{a_{11}} x_2-\cdots-\frac{a_{1 n}}{a_{11}} x_n+\frac{b_1}{a_{11}} \\ x_2=-\frac{a_{21}}{a_{22}} x_1-\cdots-\frac{a_{2 n}}{a_{22}} x_n+\frac{b_2}{a_{22}} \\ \quad \cdots \\ x_n=-\frac{a_{n 1}}{a_{n n}} x_1-\frac{a_{n 2}}{a_{n n}} x_2-\cdots-\frac{a_{n n-1}}{a_{n n}} x_{n-1}+\frac{b_n}{a_{n n}} \end{array}\right. ⇓⎩ ⎨ ⎧x1=−a11a12x2−⋯−a11a1nxn+a11b1x2=−a22a21x1−⋯−a22a2nxn+a22b2⋯xn=−annan1x1−annan2x2−⋯−annann−1xn−1+annbn
⇓ x i ( k + 1 ) = b i − ∑ j = 1 i − 1 a i j x j ( k ) − ∑ j = i + 1 n a i j x j ( k ) a i i ; i = 1 , 2 , ⋯ , n \Downarrow\\ x_i^{(k+1)}=\frac{b_i-\sum_{j=1}^{i-1} a_{i j} x_j^{(k)}-\sum_{j=i+1}^n a_{i j} x_j^{(k)}}{a_{i i}} ; i=1,2, \cdots, n ⇓xi(k+1)=aiibi−∑j=1i−1aijxj(k)−∑j=i+1naijxj(k);i=1,2,⋯,n
- 例题一道
写一个Jacobi迭代程序,输入维数n,求解Ax=b,其中:
U = − ( n + 1 1 ⋯ 1 1 n + 2 ⋯ 1 ⋮ ⋱ ⋮ 1 1 ⋯ 2 n ) , b = ( 1 2 ⋮ n ) U=-\begin{pmatrix} n+1 & 1 & \cdots & 1 \\ 1 & n+2 & \cdots & 1 \\ \vdots & & \ddots & \vdots \\ 1& 1 & \cdots&2n \end{pmatrix}\quad ,b=\begin{pmatrix}1\\2\\ \vdots\\n\end{pmatrix} U=− n+11⋮11n+21⋯⋯⋱⋯11⋮2n ,b= 12⋮n
function [x,iteration]=jacobi(n,tol)
% n为维数,tol为误差
A=ones(n,n);
D=diag([n+1:2*n]); %n*n
L=-tril(A,-1); %n*n
U=-triu(A,1); % n*n
A=D-L-U;
b=[1:n]'; %n*1
x=zeros(size(b)) % initialize x
for iteration=1:2000
x=D\(L*x+U*x+b) % remember divide by'\' no '/'
error=norm(b-A*x,2)/norm(b,2) % 2-norm to calculate error
if error
- notes
简单记忆:向左倒是左除\,向右倒是右除/
/:右除:a/b表示矩阵a乘以矩阵b的逆
\:左除:a\b表示矩阵a的逆乘以b。
6.3.3 Gauss-Seidel
- 本质:
x ( k + 1 ) = ( D − L ) − 1 U x ( k ) + ( D − L ) − 1 b x^{(k+1)}=(D-L)^{-1}Ux^{(k)}+(D-L)^{-1}b x(k+1)=(D−L)−1Ux(k)+(D−L)−1b
- 上一次迭代出来的x,会继承给下一次迭代
a i i ⋅ x i ( k + 1 ) + ∑ j = 1 i − 1 a i j x j ( k + 1 ) = ( b i − ∑ j = i + 1 n a i j x j ( k ) ) , i = 1 , 2 , … , n a_{i i} \cdot x_i^{(k+1)}+\sum_{j=1}^{i-1} a_{i j} x_j^{(k+1)}=\left(b_i-\sum_{j=i+1}^n a_{i j} x_j^{(k)}\right), \quad i=1,2, \ldots, n aii⋅xi(k+1)+j=1∑i−1aijxj(k+1)=(bi−j=i+1∑naijxj(k)),i=1,2,…,n
⇓ ( D − L ) ⋅ x = U x + b \Downarrow\\ (D-L)\cdot x=Ux+b ⇓(D−L)⋅x=Ux+b
⇓ x i ( k + 1 ) = 1 a i i ( b i − ∑ j = 1 i − 1 a i j x j ( k + 1 ) − ∑ j = i + 1 n a i j x j ( k ) ) , i = 1 , 2 , … , n \Downarrow\\ x_i^{(k+1)}=\frac{1}{a_{i i}}\left(b_i-\sum_{j=1}^{i-1} a_{i j} x_j^{(k+1)}-\sum_{j=i+1}^n a_{i j} x_j^{(k)}\right), \quad i=1,2, \ldots, n ⇓xi(k+1)=aii1(bi−j=1∑i−1aijxj(k+1)−j=i+1∑naijxj(k)),i=1,2,…,n
- 一道例题
用高斯赛德尔迭代求解Ax=b,其中:
A = ( 20 4 6 4 20 8 6 8 20 ) , b = ( 10 − 24 − 22 ) A=\begin{pmatrix} 20&4&6\\ 4&20&8\\ 6&8&20 \end{pmatrix}\quad ,b=\begin{pmatrix}10\\-24\\-22\end{pmatrix} A= 204642086820 ,b= 10−24−22
A=[20 4 6;4 20 8;6 8 20];
b=[10 -24 -22]';
[xgs,itergx]=gs(A,b,5e-5)
%% guess-seidel
function [x,iter]=gs(A,b,tol)
D=diag(diag(A));
L=-tril(A,-1);
U=-triu(A,1);
x=zeros(size(b));
for iter=1:2000
x=(D-L)\(U*x+b); % D->(D-L)
error=norm(b-A*x)/norm(b);
if error
6.3.4 SOR:Successive Over Relaxation
- 本质:上一次迭代结果+残差修正量
x ( k + 1 ) = x ( k ) + ω ⋅ D − 1 ( L x ( k + 1 ) + U x ( k ) − D x ( k ) + b ) x^{(k+1)}=x^{(k)}+\omega\cdot D^{-1}(Lx^{(k+1)}+Ux^{(k)}-Dx^{(k)}+b) x(k+1)=x(k)+ω⋅D−1(Lx(k+1)+Ux(k)−Dx(k)+b)
- 推导:高斯赛德尔迭代->分离 x ( K ) x^{(K)} x(K),然后对剩余项乘以一个松弛因子
x i ( k + 1 ) = 1 a i i ( b i − ∑ j = 1 i − 1 a i j x j ( k + 1 ) − ∑ j = i + 1 n a i j x j ( k ) ) , i = 1 , 2 , … , n x_i^{(k+1)}=\frac{1}{a_{i i}}\left(b_i-\sum_{j=1}^{i-1} a_{i j} x_j^{(k+1)}-\sum_{j=i+1}^n a_{i j} x_j^{(k)}\right), \quad i=1,2, \ldots, n xi(k+1)=aii1(bi−j=1∑i−1aijxj(k+1)−j=i+1∑naijxj(k)),i=1,2,…,n
⇓ x i ( k + 1 ) = x i ( k ) + 1 a i i r i ( k ) r i ( k ) = ( b i − ∑ j = 1 i − 1 a i j x j ( k + 1 ) − ∑ j = i n a i j x j ( k ) ) , i = 1 , 2 , ⋯ , n \Downarrow\\ \mathrm{x}_{\mathrm{i}}^{(\mathrm{k}+1)}=\mathrm{x}_{\mathrm{i}}^{(\mathrm{k})}+\frac{1}{\mathrm{a}_{\mathrm{ii}}} \mathrm{r}_{\mathrm{i}}^{(\mathrm{k})} \\ \mathrm{r}_{\mathrm{i}}^{(\mathrm{k})}=\left(\mathrm{b}_{\mathrm{i}}-\sum_{\mathrm{j}=1}^{\mathrm{i}-1} \mathrm{a}_{\mathrm{ij}} \mathrm{x}_{\mathrm{j}}^{(\mathrm{k}+1)}-\sum_{\mathrm{j}=\mathrm{i}}^{\mathrm{n}} \mathrm{a}_{\mathrm{ij}} \mathrm{x}_{\mathrm{j}}^{(\mathrm{k})}\right), \quad \mathrm{i}=1,2, \cdots, \mathrm{n} ⇓xi(k+1)=xi(k)+aii1ri(k)ri(k)= bi−j=1∑i−1aijxj(k+1)−j=i∑naijxj(k) ,i=1,2,⋯,n
⇓ x i ( k + 1 ) = ( 1 − ω ) x i ( k ) + ω a i i ( b i − ∑ j = 1 i − 1 a i j x j ( k + 1 ) − ∑ j = i + 1 n a i j x j ( k ) ) , i = 1 , 2 , ⋯ , n ( 2.5 ) \Downarrow\\ \mathrm{x}_{\mathrm{i}}^{(\mathrm{k}+1)}=(1-\omega) \mathrm{x}_{\mathrm{i}}^{(\mathrm{k})}+\frac{\omega}{\mathrm{a}_{\mathrm{ii}}}\left(\mathrm{b}_{\mathrm{i}}-\sum_{\mathrm{j}=1}^{\mathrm{i}-1} \mathrm{a}_{\mathrm{ij}} \mathrm{x}_{\mathrm{j}}^{(\mathrm{k}+1)}-\sum_{\mathrm{j}=\mathrm{i}+1}^{\mathrm{n}} \mathrm{a}_{\mathrm{ij}} \mathrm{x}_{\mathrm{j}}^{(\mathrm{k})}\right), \quad \mathrm{i}=1,2, \cdots, \mathrm{n}(2.5) ⇓xi(k+1)=(1−ω)xi(k)+aiiω bi−j=1∑i−1aijxj(k+1)−j=i+1∑naijxj(k) ,i=1,2,⋯,n(2.5)
- 一道例题
用超松弛迭代求解Ax=b,其中:
A = ( 20 4 6 4 20 8 6 8 20 ) , b = ( 10 − 24 − 22 ) A=\begin{pmatrix} 20&4&6\\ 4&20&8\\ 6&8&20 \end{pmatrix}\quad ,b=\begin{pmatrix}10\\-24\\-22\end{pmatrix} A= 204642086820 ,b= 10−24−22
function [x,iter]=sor(A,b,omega,tol)
D=diag(diag(A));
L=-tril(A,-1);
U=-triu(A,1);
x=zeros(size(b));
for iter=1:2000
x=(D-omega*L)\((1-omega)*D*x+omega*U*x+omega*b);
error=norm(b-A*x)/norm(b);
if error
- 特点
- 高斯赛德尔迭代是 ω = 1 \omega =1 ω=1时的SOR迭代
- 采用松弛因子是为了加快迭代方法收敛的速度
6.4 迭代法收敛性和误差分析
6.4.1 可约矩阵
- 排列矩阵:
- 每行每列仅有唯一的非0元1的方阵P
- 对任意矩阵A左乘一个排列矩阵P,就是对矩阵A的行进行排列。
- 对任意矩阵A右乘一个排列矩阵P,就是对矩阵A的列进行排列。
P = ( 0 1 0 1 0 0 0 0 1 ) P=\begin{pmatrix} 0&1&0\\1&0&0\\0&0&1 \end{pmatrix} P= 010100001
- 可约矩阵:设A是n阶矩阵,如果存在n阶排列矩阵P,满足:
P T A P = ( A 11 A 12 0 A 22 ) P^{T}AP=\begin{pmatrix} A_{11}&A_{12}\\0&A_{22} \end{pmatrix} PTAP=(A110A12A22)
其中, A 11 A_{11} A11和 A 22 A_{22} A22分别为r阶和n-r阶的方阵,则称为可约矩阵;若不存在,则称为不可约矩阵。
- 应用:类似于高斯消元,将x和b也进行类似转换,可以得到,从而加快运算
{ A 11 y 1 + A 12 y 2 = f 1 A 11 y 2 = f 2 \begin{cases} A_{11}y_1+A_{12}y_2=f_1\\ \;\;\;\;\;\;\;\;\;\;\;\;\;A_{11}y_2=f_2\\ \end{cases} {A11y1+A12y2=f1A11y2=f2
6.4.2 对角占优
- 若n阶矩阵A满足:
∣ a i i ∣ > ∑ j ≠ i ∣ a i j ∣ , i = 1 , ⋯ , n . \left|a_{i i}\right|>\sum_{j \neq i}\left|a_{i j}\right|, \quad i=1, \cdots, n . ∣aii∣>j=i∑∣aij∣,i=1,⋯,n.
- 至少满足一个:弱对角占优
- 全部满足:严格对角占优
- 都不满足:不对角占优
- 例题一道
判断下列矩阵属于什么占优?
( 5 3 2 − 2 4 2 6 1 8 ) \begin{pmatrix} 5&3&2\\-2&4&2\\6&1&8 \end{pmatrix} 5−26341228
5=|3|+|2|;
4=|-2|+|2|;
8>|6|+|1|;
因此,这是一个弱对角占优
- 引理一:若A是严格对角占优OR不可约弱对角占优,则A是可逆矩阵(非奇异矩阵==det|A|≠0)
反证法,若 d e t ∣ A ∣ = 0 det|A|=0 det∣A∣=0,则对 A x = 0 Ax=0 Ax=0有非零解 x = ( x 1 , ⋯ , x n ) x=(x_1,\cdots,x_n) x=(x1,⋯,xn)
令 x r x_r xr为最大的一个,则 ∣ x r ∣ > 0 |x_r|>0 ∣xr∣>0,不妨令 ∣ x r ∣ = 1 |x_r|=1 ∣xr∣=1,取第r个方程:
∑ j = 1 n a r j x j = 0 \sum_{j=1}^{n} a_{rj}x_j=0 j=1∑narjxj=0
− a r r x r = ∑ j = 1 , j ≠ r n a r j x j -a_{rr}x_r=\sum_{j=1,j≠r}^{n} a_{rj}x_j −arrxr=j=1,j=r∑narjxj
∣ a r r ∣ ∣ x r ∣ ≤ ∑ j = 1 , j ≠ r n ∣ a r j ∣ ∣ x j ∣ |a_{rr}||x_r|\leq\sum_{j=1,j≠r}^{n} |a_{rj}||x_j| ∣arr∣∣xr∣≤j=1,j=r∑n∣arj∣∣xj∣
∣ a r r ∣ ≤ ∑ j = 1 , j ≠ r n ∣ a r j ∣ ∣ x j ∣ ≤ ∑ j = 1 , j ≠ r n ∣ a r j ∣ |a_{rr}|\leq\sum_{j=1,j≠r}^{n} |a_{rj}||x_j|\leq\sum_{j=1,j≠r}^{n} |a_{rj}| ∣arr∣≤j=1,j=r∑n∣arj∣∣xj∣≤j=1,j=r∑n∣arj∣
因此,A可逆
- 引理二:n阶矩阵A的k次幂 A k → 0 ( k → ∞ ) \boldsymbol{A^k\rightarrow0(k\rightarrow\infty)} Ak→0(k→∞) ⇔ \Leftrightarrow ⇔ 谱半径 ρ ( A ) < 1 \boldsymbol{\rho(A)<1} ρ(A)<1
6.4.3 迭代法基本定理
- 定理一: ρ ( A ) ≤ ∣ ∣ A ∣ ∣ \boldsymbol{\rho(A)\leq||A||} ρ(A)≤∣∣A∣∣,其中,A表示任一与某一向量范数相容的矩阵范数
- 定理二:对于基本迭代格式,给定初值 x ( 0 ) x^{(0)} x(0), x ∗ x^* x∗是真值,有下列收敛结果和误差估计
- 迭代格式收敛的充要条件: ρ ( B ) < 1 \boldsymbol{\rho(B)<1} ρ(B)<1
- 如果 ∣ ∣ B ∣ ∣ < 1 ||B||<1 ∣∣B∣∣<1,则有:
∥ x ( k ) − x ∗ ∥ ⩽ ∥ B ∥ k 1 − ∥ B ∥ ∥ x ( 1 ) − x ( 0 ) ∥ ∥ x ( k ) − x ∗ ∥ ⩽ ∥ B ∥ 1 − ∥ B ∥ ∥ ( k ) − x ( k − 1 ) ∥ \begin{aligned} & \left\|x^{(k)}-x_*\right\| \leqslant \frac{\|B\|^k}{1-\|B\|}\left\|x^{(1)}-x^{(0)}\right\| \\ & \left\|x^{(k)}-x_*\right\| \leqslant \frac{\|B\|}{1-\|B\|}\left\|^{(k)}-x^{(k-1)}\right\| \end{aligned} x(k)−x∗ ⩽1−∥B∥∥B∥k x(1)−x(0) x(k)−x∗ ⩽1−∥B∥∥B∥ (k)−x(k−1)
proof1:
x ( k ) − x ∗ = B ( x ( k − 1 ) − x ∗ ) = ⋯ = B k ( x ( 0 ) − x ∗ ) ↓ x ( 0 ) − x ∗ = c o n s t ↓ 由引理二: w h e n k → ∞ , B k → 0 ↔ ρ ( B ) < 1 x^{(k)}-x^*=B(x^{(k-1)}-x^*)=\cdots=B^{k}(x^{(0)}-x^*)\\ \downarrow\\ x^{(0)}-x^*=const\\ \downarrow\\ 由引理二:when\;k\rightarrow\infty, \;B^k\rightarrow0 \leftrightarrow \rho(B)<1 x(k)−x∗=B(x(k−1)−x∗)=⋯=Bk(x(0)−x∗)↓x(0)−x∗=const↓由引理二:whenk→∞,Bk→0↔ρ(B)<1
proof2: 运用矩阵范数的三角不等式原理
x ( k ) − x ∗ = x ( k ) − x ( k + 1 ) + x ( k + 1 ) − x ∗ = [ ( B x ( k − 1 ) + f ) − ( B x ( k ) + f ) ] + [ ( B x ( k ) + f ) − ( B x ∗ + f ) ] = B ( x ( k − 1 ) − x ( k ) ) + B ( x ( k ) − x ∗ ) ↓ ∥ x ( k ) − x ( k + 1 ) + x ( k + 1 ) − x ∗ ∥ ⩽ ∥ B x ( k − 1 ) + f − ( B x ( k ) + f ) ∥ + ∥ ( B x ( k ) + f ) − ( B x ∗ + f ) ∥ ⩽ ∥ B ∥ ∥ x ( k − 1 ) − x ( k ) ∥ + ∥ B ∥ ∥ x ( k ) − x ∗ ∥ ↓ [ 1 − ∥ B ∥ ) ∥ x ( k ) − x ∗ ∥ ⩽ ∥ B ∥ ∥ x ( k − 1 ) − x ( k ) ∥ ↓ ∥ x ( k ) − x A ∥ ⩽ ∥ B ∥ ( 1 − ∥ B ∥ ) ∥ x ( k − 1 ) − x ( k ) ∥ \begin{aligned} x^{(k)}-x^* &=x^{(k)}-x^{(k+1)}+x^{(k+1)}-x^*\\ &=[(B x^{(k-1)}+f)-(B x^{(k)}+f)]+[(Bx^{(k)}+f)-(Bx^*+f)]\\ &=B\left(x^{(k-1)}-x^{(k)}\right)+B\left(x^{(k)}-x^*\right) \end{aligned}\\ \downarrow\\ \begin{aligned} \left\|x^{(k)}-x^{(k+1)}+x^{(k+1)}-x^*\right\| & \leqslant\left\|B x^{(k-1)}+f-\left(B x^{(k)}+f\right)\right\|+\left\|\left(B x^{(k)}+f\right)-\left(B x^*+f\right)\right\| \\ & \leqslant\|B\|\left\|x^{(k-1)}-x^{(k)}\right\|+\|B\|\left\|x^{(k)}-x^*\right\| \end{aligned}\\ \downarrow\\ {[1-\|B\|)\left\|x^{(k)}-x^*\right\| \leqslant\|B\|\left\|x^{(k-1)}-x^{(k)}\right\|} \\ \downarrow\\ \left\|x^{(k)}-x^A\right\| \leqslant \frac{\|B\|}{(1-\|B\|)}\left\|x^{(k-1)}-x^{(k)}\right\| x(k)−x∗=x(k)−x(k+1)+x(k+1)−x∗=[(Bx(k−1)+f)−(Bx(k)+f)]+[(Bx(k)+f)−(Bx∗+f)]=B(x(k−1)−x(k))+B(x(k)−x∗)↓ x(k)−x(k+1)+x(k+1)−x∗ ⩽ Bx(k−1)+f−(Bx(k)+f) + (Bx(k)+f)−(Bx∗+f) ⩽∥B∥ x(k−1)−x(k) +∥B∥ x(k)−x∗ ↓[1−∥B∥) x(k)−x∗ ⩽∥B∥ x(k−1)−x(k) ↓ x(k)−xA ⩽(1−∥B∥)∥B∥ x(k−1)−x(k)
- 定理三:若A是严格对角占优OR不可约弱对角占优 ⇒ \Rightarrow ⇒ Jacobi和GS迭代法收敛
- 定理四:若A是对称正定矩阵,Jacobi收敛 ⇔ \Leftrightarrow ⇔ 2D-A也为正定对称
- 定理五:SOR迭代收敛 ⇒ \Rightarrow ⇒ 0 < ω < 2 \boldsymbol{0<\omega<2} 0<ω<2
B S O R = ( D − ω L ) − 1 [ ( 1 − ω ) D + ω U ] B_{SOR}=(D-\omega L)^{-1}[(1-\omega)D+\omega U] BSOR=(D−ωL)−1[(1−ω)D+ωU],其特征值为 λ 1 , ⋯ , λ n \lambda_{1},\cdots,\lambda_{n} λ1,⋯,λn
n个谱半径肯定大于各特征值之积,各特征值的积就是det(A)
( ρ ( B S O R ) ) ≥ ∣ λ 1 λ 2 ⋯ λ n ∣ = d e t ( B S O R ) = d e t ∣ 下三角矩阵 ∣ ⋅ d e t ∣ 上三角矩阵 ∣ = ∣ a 11 − 1 ⋯ a n n − 1 ∣ ∣ ( 1 − ω ) n a 11 ⋯ a n n ∣ = ( 1 − ω ) n < 1 \begin{aligned} (\rho(B_{SOR}))&\geq|\lambda_{1}\lambda_{2}\cdots\lambda_{n}|=det(B_{SOR})\\ &=det|下三角矩阵|\cdot det|上三角矩阵|\\ &=|a_{11}^{-1}\cdots a_{nn}^{-1}||(1-\omega)^na_{11}\cdots a_{nn}|\\ &=(1-\omega)^n <1 \end{aligned} (ρ(BSOR))≥∣λ1λ2⋯λn∣=det(BSOR)=det∣下三角矩阵∣⋅det∣上三角矩阵∣=∣a11−1⋯ann−1∣∣(1−ω)na11⋯ann∣=(1−ω)n<1
- 定理六:若A是对称正定矩阵,且 0 < ω < 2 \boldsymbol{0<\omega<2} 0<ω<2 ⇔ \Leftrightarrow ⇔ SOR迭代收敛
- 一道例题
设线性方程组Ax=b的系数矩阵为:
A = ( 1 a a a 1 a a a 1 ) A=\begin{pmatrix}1&a&a\\a&1&a\\a&a&1\end{pmatrix} A= 1aaa1aaa1
证明:
(1) 当-0.5 (2) 当-0.5 (3) 当a=0.8时,用Jacobi迭代求解发散
求解:
(1)由于不是对角占优,因此采用定理六,判断对称正定矩阵, ω = 1 \omega=1 ω=1
(2)定理三,充分条件,证明严格对角占优矩阵
(3)定理四,充要条件,证明2D-A不是对称正定矩阵
6.5 不定常迭代
6.5.1 特点
基于变分方法来最小化线性方程组的残量方法,没有明显的迭代矩阵
6.5.2 分类
- 最速下降法:求解对称正定线性方程组
- 共轭梯度法:求解对称正定线性方程组,本质上是一种变分的方法,即求二次函数的极值
- 广义极小残量法:求解不对称线性方程组