在线性最小二乘里,最后得到的闭式解为 ( A T A ) x = A T b ({A^T}A)x = {A^T}b (ATA)x=ATb,对于该等式的求解,其中一种方法是正交变换法求解,而正交变换求解线性最小二乘的实质就是对矩阵 A A A进行QR分解。
QR分解的概念:若实非奇异对称矩阵A能化为正交矩阵Q和实非奇异上三角矩阵R的乘积,即 A = Q R A=QR A=QR,则称该式为A的QR分解。
事实上,任何的实非奇异n阶矩阵A都可以分解成正交矩阵Q和上三角矩阵R的乘积,这可以通过施密特正交化来证明。
证明过程如下:
对于实非奇异n阶矩阵A的各个列向量依次为 α 1 , α 2 , ⋯ α n {\alpha _1},{\alpha _2}, \cdots {\alpha _n} α1,α2,⋯αn,那么,通过施密特正交化得到的n个标准正交向量如下,
{ β 1 = b 11 α 1 β 2 = b 12 α 1 + b 22 α 2 ⋮ β n = b 1 n α 1 + b 2 n α 2 + ⋯ + b n n α n \left\{ {\begin{array}{l} {{\beta _1} = {b_{11}}{\alpha _1}}\\ {{\beta _2} = {b_{12}}{\alpha _1} + {b_{22}}{\alpha _2}}\\ \vdots \\ {{\beta _n} = {b_{1n}}{\alpha _1} + {b_{2n}}{\alpha _2} + \cdots + {b_{nn}}{\alpha _n}} \end{array}} \right. ⎩⎪⎪⎪⎨⎪⎪⎪⎧β1=b11α1β2=b12α1+b22α2⋮βn=b1nα1+b2nα2+⋯+bnnαn
则,写成矩阵形式为,
( β 1 , β 2 , ⋯ β n ) = ( α 1 , α 2 , ⋯ α n ) B ({\beta _1},{\beta _2}, \cdots {\beta _n}) = ({\alpha _1},{\alpha _2}, \cdots {\alpha _n})B (β1,β2,⋯βn)=(α1,α2,⋯αn)B
即,
Q = A B Q=AB Q=AB
其中, B = [ b 11 b 12 ⋯ b 1 n b 22 ⋯ b 2 n ⋱ ⋮ b n n ] B = \left[ {\begin{array}{l} {{b_{11}}}&{{b_{12}}}& \cdots &{{b_{1n}}}\\ {}&{{b_{22}}}& \cdots &{{b_{2n}}}\\ {}&{}& \ddots & \vdots \\ {}&{}&{}&{{b_{nn}}} \end{array}} \right] B=⎣⎢⎢⎢⎡b11b12b22⋯⋯⋱b1nb2n⋮bnn⎦⎥⎥⎥⎤
很显然,矩阵B就是一个上三角矩阵,而且B可逆,QR分解中的R就是 R = B − 1 R = {B^{ - 1}} R=B−1, Q = [ β 1 , β 2 , ⋯ β n ] Q=[{\beta _1},{\beta _2}, \cdots {\beta _n}] Q=[β1,β2,⋯βn]是正交矩阵(施密特正交化的结果),则有,
A = Q B − 1 = Q R A=QB^{-1}=QR A=QB−1=QR
这就说明了QR分解的存在性。
从上述的QR分解存在性的证明过程可以看出,QR分解步骤的关键就是找到一个上三角矩阵R和一个正交矩阵Q。那么,怎么才能得到一个Q和R,正交变换就能做到这一点。所以,QR分解的方法就有三种,第一种就是上述证明过程所示的,用施密特正交化求解,其他两种是Givens变换(初等旋转变换)和Housholder(镜像变换)。
施密特正交化
设 α 1 , α 2 , ⋯ α n {\alpha _1},{\alpha _2}, \cdots {\alpha _n} α1,α2,⋯αn为一组线性无关的向量,施密特正交化后的向量为 β 1 , β 2 , ⋯ β n {\beta _1},{\beta _2}, \cdots {\beta _n} β1,β2,⋯βn,则施密特正交化的一般式如下,
{ β m = α m + k m 1 β m − 1 + k m 2 β m − 2 + ⋯ + k m , m − 1 β 1 k m i = − ( x m , β m − i ) ( β m − i , β m − i ) i = 1 , 2 , ⋯ m − 1 \left\{ {\begin{array}{l} {{\beta _m} = {\alpha _m} + {k_{m1}}{\beta _{m - 1}} + {k_{m2}}{\beta _{m - 2}} + \cdots + {k_{m,m - 1}}{\beta _1}}\\\\ {{k_{mi}} = - \frac{{({x_m},{\beta _{m - i}})}}{{({\beta _{m - i}},{\beta _{m - i}})}}{\kern 1pt} {\kern 1pt} {\kern 1pt} \;\;\;\;\;\;\;i = 1,2, \cdots m - 1} \end{array}} \right. ⎩⎨⎧βm=αm+km1βm−1+km2βm−2+⋯+km,m−1β1kmi=−(βm−i,βm−i)(xm,βm−i)i=1,2,⋯m−1
举个例子,对矩阵 A = [ 1 1 0 1 − 1 1 0 0 2 ] A = \left[ {\begin{array}{l} 1&1&0\\ 1&{ - 1}&1\\ 0&0&2 \end{array}} \right] A=⎣⎡1101−10012⎦⎤进行QR分解。
令 A = [ α 1 , α 2 , α 3 ] A=[\alpha_1,\alpha_2,\alpha_3] A=[α1,α2,α3],则 α 1 = ( 1 , 1 , 0 ) T , α 2 = ( 1 , − 1 , 0 ) T , α 1 = ( 0 , 1 , 2 ) T \alpha_1=(1,1,0)^T,\alpha_2=(1,-1,0)^T,\alpha_1=(0,1,2)^T α1=(1,1,0)T,α2=(1,−1,0)T,α1=(0,1,2)T,则根据上述的施密特方法得,
{ β 1 ′ = α 1 = ( 1 , 1 , 0 ) T β 2 ′ = α 2 − ( β 1 ′ , α 2 ) ( β 1 ′ , β 1 ′ ) β 1 ′ = ( 1 , − 1 , 0 ) T β 3 ′ = α 3 − ( β 1 ′ , α 3 ) ( β 1 ′ , β 1 ′ ) β 1 ′ − ( β 2 ′ , α 3 ) ( β 2 ′ , β 2 ′ ) β 2 ′ = ( 0 , 0 , 2 ) T \left\{ {\begin{array}{l} {\beta _1^{'} = {\alpha _1} = {{(1,1,0)}^T}}\\\\ {\beta _2^{'} = {\alpha _2} - \frac{{(\beta _1^{'},{\alpha _2})}}{{(\beta _1^{'},\beta _1^{'})}}\beta _1^{'} = {{(1, - 1,0)}^T}}\\\\ {\beta _3^{'} = {\alpha _3} - \frac{{(\beta _1^{'},{\alpha _3})}}{{(\beta _1^{'},\beta _1^{'})}}\beta _1^{'} - \frac{{(\beta _2^{'},{\alpha _3})}}{{(\beta _2^{'},\beta _2^{'})}}\beta _2^{'} = {{(0,0,2)}^T}} \end{array}} \right. ⎩⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎧β1′=α1=(1,1,0)Tβ2′=α2−(β1′,β1′)(β1′,α2)β1′=(1,−1,0)Tβ3′=α3−(β1′,β1′)(β1′,α3)β1′−(β2′,β2′)(β2′,α3)β2′=(0,0,2)T
单位化后得,
{ β 1 = ( 1 2 , 1 2 , 0 ) T = 1 2 α 1 β 2 = ( 1 2 , − 1 2 , 0 ) T = 1 2 α 2 β 3 = ( 0 , 0 , 1 ) T = − 1 4 α 1 + 1 4 α 2 + 1 2 α 3 \left\{ {\begin{array}{l} {{\beta _1} = {{(\frac{1}{{\sqrt 2 }},\frac{1}{{\sqrt 2 }},0)}^T} = \frac{1}{{\sqrt 2 }}{\alpha _1}}\\\\ {{\beta _2} = {{(\frac{1}{{\sqrt 2 }}, - \frac{1}{{\sqrt 2 }},0)}^T} = \frac{1}{{\sqrt 2 }}{\alpha _2}}\\\\ {{\beta _3} = {{(0,0,1)}^T} = - \frac{1}{4}{\alpha _1} + \frac{1}{4}{\alpha _2} + \frac{1}{2}{\alpha _3}} \end{array}} \right. ⎩⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎧β1=(21,21,0)T=21α1β2=(21,−21,0)T=21α2β3=(0,0,1)T=−41α1+41α2+21α3
上式写成矩阵形式,
( β 1 , β 2 , β 3 ) ⏟ Q = ( α 1 , α 2 , α 3 ) ⏟ A [ 1 2 0 − 1 4 0 1 2 1 4 0 0 1 2 ] \underbrace {({\beta _1},{\beta _2},{\beta _3})}_Q = \underbrace {({\alpha _1},{\alpha _2},{\alpha _3})}_A\left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&0&{ - \frac{1}{4}}\\\\ 0&{\frac{1}{{\sqrt 2 }}}&{\frac{1}{4}}\\\\ 0&0&{\frac{1}{2}} \end{array}} \right] Q (β1,β2,β3)=A (α1,α2,α3)⎣⎢⎢⎢⎢⎡21000210−414121⎦⎥⎥⎥⎥⎤
A = ( β 1 , β 2 , β 3 ) [ 1 2 0 − 1 4 0 1 2 1 4 0 0 1 2 ] − 1 = [ 1 2 1 2 0 1 2 − 1 2 0 0 0 1 ] [ 2 0 1 2 0 2 − 1 2 0 0 2 ] = Q R \begin{array}{l} A = ({\beta _1},{\beta _2},{\beta _3}){\left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&0&{ - \frac{1}{4}}\\\\ 0&{\frac{1}{{\sqrt 2 }}}&{\frac{1}{4}}\\\\ 0&0&{\frac{1}{2}} \end{array}} \right]^{ - 1}}\\\\ \;{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = \left[ {\begin{array}{l} {\frac{1}{{\sqrt 2 }}}&{\frac{1}{{\sqrt 2 }}}&0\\\\ {\frac{1}{{\sqrt 2 }}}&{ - \frac{1}{{\sqrt 2 }}}&0\\\\ 0&0&1 \end{array}} \right]\left[ {\begin{array}{l} {\sqrt 2 }&0&{\frac{1}{{\sqrt 2 }}}\\\\ 0&{\sqrt 2 }&{ - \frac{1}{{\sqrt 2 }}}\\\\ 0&0&2 \end{array}} \right]\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = QR \end{array} A=(β1,β2,β3)⎣⎢⎢⎢⎢⎡21000210−414121⎦⎥⎥⎥⎥⎤−1=⎣⎢⎢⎢⎢⎡2121021−210001⎦⎥⎥⎥⎥⎤⎣⎢⎢⎢⎢⎡20002021−212⎦⎥⎥⎥⎥⎤=QR
从上例可以看出,矩阵的QR实际上就是施密特正交化的特性,施密特正交化可以得到一个上三角矩阵和一个正交矩阵,所以矩阵可以QR分解。
Givens变换法
Givens变换,即初等旋转变换,正交变换的一种,对应的初等旋转矩阵形式如下,
R i j = [ 1 ⋱ 1 cos θ 0 ⋯ 0 sin θ 0 1 ⋮ ⋮ ⋮ ⋮ 0 0 ⋯ 1 0 − sin θ 0 ⋯ 0 cos θ 1 ⋱ 1 ] ← i 行 ← j 行 {R_{ij}} = \left[ {\begin{array}{l} 1&{}&{}&{}&{}&{}&{}&{}&{}&{}&{}\\ {}& \ddots &{}&{}&{}&{}&{}&{}&{}&{}&{}\\ {}&{}&1&{}&{}&{}&{}&{}&{}&{}&{}\\ {}&{}&{}&{\cos \theta }&0& \cdots &0&{\sin \theta }&{}&{}&{}\\ {}&{}&{}&0&1&{}&{}&{}&{}&{}&{}\\ {}&{}&{}& \vdots & \vdots &{}& \vdots & \vdots &{}&{}&{}\\ {}&{}&{}&0&0& \cdots &1&0&{}&{}&{}\\ {}&{}&{}&{ - \sin \theta }&0& \cdots &0&{\cos \theta }&{}&{}&{}\\ {}&{}&{}&{}&{}&{}&{}&{}&1&{}&{}\\ {}&{}&{}&{}&{}&{}&{}&{}&{}& \ddots &{}\\ {}&{}&{}&{}&{}&{}&{}&{}&{}&{}&1 \end{array}} \right]\begin{array}{l} {}\\ {}\\ {}\\ { \leftarrow i行}\\ {}\\ {}\\ {}\\ { \leftarrow j行}\\ {}\\ {}\\ {} \end{array} Rij=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎡1⋱1cosθ0⋮0−sinθ01⋮00⋯⋯⋯0⋮10sinθ⋮0cosθ1⋱1⎦⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎤←i行←j行
初等变换矩阵在矩阵化简方面有很好的应用,即,若以 R i j R_{ij} Rij左乘一个矩阵A,并且其中的s和c按如下方式取值,
s = x j x i 2 + x j 2 , c = x i x i 2 + x j 2 s = \frac{{{x_j}}}{{\sqrt {x_i^2 + x_j^{\rm{2}}} }},c = \frac{{{x_i}}}{{\sqrt {x_i^2 + x_j^{\rm{2}}} }} s=xi2+xj2xj,c=xi2+xj2xi则就可以把矩阵A的第j个分量化为0。
举个例子,把向量 x = ( 1 , 2 , 3 ) T x=(1,2,3)^T x=(1,2,3)T通过Givens变换化成与 ( 1 , 0 , 0 ) T (1,0,0)^T (1,0,0)T同方向。
第一步,可以选择把向量x中的第二项化为0,则,
c = x 1 x 1 2 + x 2 2 = 1 5 s = x 2 x 1 2 + x 2 2 = 2 5 \begin{array}{l} c = \frac{{{x_1}}}{{\sqrt {x_1^2 + x_2^{\rm{2}}} }} = \frac{1}{{\sqrt 5 }}\\\\ s = \frac{{{x_2}}}{{\sqrt {x_1^2 + x_2^{\rm{2}}} }} = \frac{2}{{\sqrt 5 }} \end{array} c=x12+x22x1=51s=x12+x22x2=52
得到初等旋转矩阵为,
R 12 = [ 1 5 2 5 0 − 2 5 1 5 0 0 0 1 ] {R_{12}} = \left[ {\begin{array}{l} {\frac{1}{{\sqrt 5 }}}&{\frac{2}{{\sqrt 5 }}}&0\\\\ { - \frac{2}{{\sqrt 5 }}}&{\frac{1}{{\sqrt 5 }}}&0\\\\ 0&0&1 \end{array}} \right] R12=⎣⎢⎢⎢⎢⎡51−52052510001⎦⎥⎥⎥⎥⎤
从而把x的第二项化为0,即,
y = R 12 x = ( 5 , 0 , 3 ) T y = {R_{12}}x = {(\sqrt 5 ,0,3)^T} y=R12x=(5,0,3)T
第二步,把y中的第3项化为0,同样,取c和s为,
c = x 1 x 1 2 + x 3 2 = 5 14 s = x 3 x 1 2 + x 3 2 = 3 14 \begin{array}{l} {c = \frac{{{x_1}}}{{\sqrt {x_1^2 + x_3^{\rm{2}}} }} = \sqrt {\frac{5}{{14}}} }\\ {}\\ {s = \frac{{{x_3}}}{{\sqrt {x_1^2 + x_3^{\rm{2}}} }} = \frac{3}{{\sqrt {14} }}} \end{array} c=x12+x32x1=145s=x12+x32x3=143
得到初等旋转矩阵为,
R 13 = [ 5 14 0 3 14 0 1 0 − 3 14 0 5 14 ] {R_{13}} = \left[ {\begin{array}{l} {\sqrt {\frac{5}{{14}}} }&0&{\frac{3}{{\sqrt {14} }}}\\ {}&{}&{}\\ 0&1&0\\ {}&{}&{}\\ { - \frac{3}{{\sqrt {14} }}}&0&{\sqrt {\frac{5}{{14}}} } \end{array}} \right] R13=⎣⎢⎢⎢⎢⎢⎢⎡1450−1430101430145⎦⎥⎥⎥⎥⎥⎥⎤
这里就把第三项化为0,即,
z = R 13 y = R 13 R 12 x = ( 14 , 0 , 0 ) T z = {R_{13}}y = {R_{13}}{R_{12}}x = {\left( {\sqrt {14} ,0,0} \right)^T} z=R13y=R13R12x=(14,0,0)T
对一个向量可以这样化简,那么对于矩阵而言,矩阵可以看作是由n个列向量组成的,所以把矩阵A左乘一系列的初等旋转矩阵,就可以把A化简为结构较为简单的上三角矩阵R。而前面已经证明了一个实非奇异的矩阵A可以进行QR分解,那么这里的Givens变换就能把一个矩阵A化成上三角矩阵,所以,Givens变换可以用于QR分解。
R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 A = R Q = ( R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 ) − 1 = ( R n − 1 , n ⋯ R 2 n ⋯ R 23 R 1 n ⋯ R 12 ) H \begin{array}{l} {R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}}A = R\\\\ Q = {({R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}})^{ - 1}}\\\\ \;{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{ = }}{({R_{n - 1,n}} \cdots {R_{2n}} \cdots {R_{23}}{R_{1n}} \cdots {R_{12}})^H} \end{array} Rn−1,n⋯R2n⋯R23R1n⋯R12A=RQ=(Rn−1,n⋯R2n⋯R23R1n⋯R12)−1=(Rn−1,n⋯R2n⋯R23R1n⋯R12)H
举个例子,用Givens方法对矩阵 A = [ 12 − 20 41 9 − 15 − 63 20 50 35 ] A = \left[ {\begin{array}{l} {12}&{ - 20}&{41}\\\\ 9&{ - 15}&{ - 63}\\\\ {20}&{50}&{35} \end{array}} \right] A=⎣⎢⎢⎢⎢⎡12920−20−155041−6335⎦⎥⎥⎥⎥⎤进行QR分解。
第一步,用 R 12 R_{12} R12左乘A,消去第1列第2行处的元素,
c = 12 12 2 + 9 2 = 12 15 = 4 5 , s = 9 12 2 + 9 2 = 3 5 c = \frac{{12}}{{\sqrt {{{12}^2} + {9^2}} }} = \frac{{12}}{{15}} = \frac{4}{5},s = \frac{9}{{\sqrt {{{12}^2} + {9^2}} }} = \frac{3}{5} c=122+9212=1512=54,s=122+929=53故,
R 12 A = [ 4 5 3 5 0 − 3 5 4 5 0 0 0 1 ] A = [ 15 − 25 − 5 0 0 − 75 20 50 35 ] {R_{12}}A = \left[ {\begin{array}{l} {\frac{4}{5}}&{\frac{3}{5}}&0\\\\ { - \frac{3}{5}}&{\frac{4}{5}}&0\\\\ 0&0&1 \end{array}} \right]A = \left[ {\begin{array}{l} {15}&{ - 25}&{ - 5}\\\\ 0&0&{ - 75}\\\\ {20}&{50}&{35} \end{array}} \right] R12A=⎣⎢⎢⎢⎢⎡54−53053540001⎦⎥⎥⎥⎥⎤A=⎣⎢⎢⎢⎢⎡15020−25050−5−7535⎦⎥⎥⎥⎥⎤
同理,接着消去A中第1列第3个元素,
c = 15 15 2 + 20 2 = 15 25 = 3 5 , s = 20 15 2 + 20 2 = 4 5 c = \frac{{15}}{{\sqrt {{{15}^2} + {{20}^2}} }} = \frac{{15}}{{25}} = \frac{3}{5},s = \frac{{20}}{{\sqrt {{{15}^2} + {{20}^2}} }} = \frac{4}{5} c=152+20215=2515=53,s=152+20220=54故
A ( 1 ) = R 13 R 12 A = [ 3 5 0 4 5 0 1 0 − 4 5 0 3 5 ] R 12 A = [ 25 25 25 0 0 − 75 0 50 25 ] {A^{(1)}} = {R_{13}}{R_{12}}A = \left[ {\begin{array}{l} {\frac{3}{5}}&0&{\frac{4}{5}}\\\\ 0&1&0\\\\ { - \frac{4}{5}}&0&{\frac{3}{5}} \end{array}} \right]{R_{12}}A = \left[ {\begin{array}{l} {25}&{25}&{25}\\\\ 0&0&{ - 75}\\\\ 0&{50}&{25} \end{array}} \right] A(1)=R13R12A=⎣⎢⎢⎢⎢⎡530−5401054053⎦⎥⎥⎥⎥⎤R12A=⎣⎢⎢⎢⎢⎡25002505025−7525⎦⎥⎥⎥⎥⎤
这里就把矩阵A中的第一列的第二个和第三个元素化为0,即第一列( A ( 1 ) A^{(1)} A(1))化简完成,下面开始第二列的化简。
第二步,进行第二列的化简,即把第二列的第三个元素化为0,
c = 0 0 2 + 50 2 = 0 , s = 50 0 2 + 50 2 = 1 c = \frac{0}{{\sqrt {{0^2} + {{50}^2}} }} = 0,s = \frac{{50}}{{\sqrt {{0^2} + {{50}^2}} }} = 1 c=02+5020=0,s=02+50250=1故,
R 23 = [ 1 0 0 0 0 1 0 − 1 0 ] {R_{23}} = \left[ {\begin{array}{l} 1&0&0\\ 0&0&1\\ 0&{ - 1}&0 \end{array}} \right] R23=⎣⎡10000−1010⎦⎤
A ( 2 ) = R 23 A ( 1 ) = [ 25 25 25 0 50 25 0 0 75 ] {A^{(2)}} = {R_{23}}{A^{(1)}} = \left[ {\begin{array}{l} {25}&{25}&{25}\\\\ 0&{50}&{25}\\\\ 0&0&{75} \end{array}} \right] A(2)=R23A(1)=⎣⎢⎢⎢⎢⎡250025500252575⎦⎥⎥⎥⎥⎤
这里的 A ( 2 ) A^{(2)} A(2)就是上三角矩阵,即QR分解里的R,而Q的求解如下,
Q = ( R 23 R 13 R 12 ) − 1 = [ 12 25 9 25 20 25 − 16 25 − 12 25 15 25 15 25 − 20 25 0 ] − 1 = [ 12 25 9 25 20 25 − 16 25 − 12 25 15 25 15 25 − 20 25 0 ] H = [ 12 25 − 16 25 15 25 9 25 − 12 25 − 20 25 20 25 15 25 0 ] \begin{array}{l} Q = {({R_{23}}{R_{13}}{R_{12}})^{ - 1}}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = {\left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{\frac{9}{{25}}}&{\frac{{20}}{{25}}}\\\\ { - \frac{{16}}{{25}}}&{ - \frac{{12}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{{15}}{{25}}}&{ - \frac{{20}}{{25}}}&0 \end{array}} \right]^{ - 1}}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = {\left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{\frac{9}{{25}}}&{\frac{{20}}{{25}}}\\\\ { - \frac{{16}}{{25}}}&{ - \frac{{12}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{{15}}{{25}}}&{ - \frac{{20}}{{25}}}&0 \end{array}} \right]^H}\\\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} = \left[ {\begin{array}{l} {\frac{{12}}{{25}}}&{ - \frac{{16}}{{25}}}&{\frac{{15}}{{25}}}\\\\ {\frac{9}{{25}}}&{ - \frac{{12}}{{25}}}&{ - \frac{{20}}{{25}}}\\\\ {\frac{{20}}{{25}}}&{\frac{{15}}{{25}}}&0 \end{array}} \right] \end{array} Q=(R23R13R12)−1=⎣⎢⎢⎢⎢⎡2512−25162515259−2512−2520252025150⎦⎥⎥⎥⎥⎤−1=⎣⎢⎢⎢⎢⎡2512−25162515259−2512−2520252025150⎦⎥⎥⎥⎥⎤H=⎣⎢⎢⎢⎢⎡25122592520−2516−251225152515−25200⎦⎥⎥⎥⎥⎤
从上例可以看出,Givens方法需要作 n ( n − 1 ) 2 \frac{{n(n - 1)}}{2} 2n(n−1)个初等旋转矩阵的乘积,当 n n n比较大时,计算量较大。
Housholder变换法
Housholder变换,即镜像变换,正交变换的一种,对应的初等反射矩阵形式如下, H = I − 2 ω ω T H = I - 2\omega {\omega ^T} H=I−2ωωT其中 ω \omega ω表示某单位向量。
其镜面关系为:若 η = H ξ \eta = H\xi η=Hξ,则向量 η \eta η和向量 ξ \xi ξ关于某单位向量 ω \omega ω的正交轴 l l l镜像对称。
用初等反射矩阵也可以完成对矩阵的简化,其简化原理用的是如下定理:
镜像变换可以使任何非零向量 ξ \xi ξ变成与给定单位向量 a a a同方向的向量 η \eta η。
用这一定理可以构造出H里单位向量 ω \omega ω如下,
ω = ξ − ∣ ξ ∣ a ∣ ξ − ∣ ξ ∣ a ∣ \omega = \frac{{\xi - \left| \xi \right|a}}{{\left| {\xi - \left| \xi \right|a} \right|}} ω=∣ξ−∣ξ∣a∣ξ−∣ξ∣a那么,可以通过构造 ω \omega ω的方法来简化矩阵A,使其变为一个上三角阵。
举个例子,用镜像变化把向量 ξ = ( 0 , 3 , 0 , 4 ) T \xi=(0,3,0,4)^T ξ=(0,3,0,4)T变为与向量 e 1 = ( 1 , 0 , 0 , 0 ) T e_1=(1,0,0,0)^T e1=(1,0,0,0)T同方向的向量。
首先就是构造 ω \omega ω,
ω = ξ − ∣ ξ ∣ e 1 ∣ ξ − ∣ ξ ∣ e 1 ∣ = ( − 5 , 3 , 0 , 4 ) T ∣ ( − 5 , 3 , 0 , 4 ) T ∣ = ( − 1 2 , 3 5 2 , 0 , 4 5 2 ) T \omega = \frac{{\xi - \left| \xi \right|{e_1}}}{{\left| {\xi - \left| \xi \right|{e_1}} \right|}} = \frac{{{{( - 5,3,0,4)}^T}}}{{\left| {{{( - 5,3,0,4)}^T}} \right|}} = {( - \frac{1}{{\sqrt 2 }},\frac{3}{{5\sqrt 2 }},0,\frac{4}{{5\sqrt 2 }})^T} ω=∣ξ−∣ξ∣e1∣ξ−∣ξ∣e1=∣∣∣(−5,3,0,4)T∣∣∣(−5,3,0,4)T=(−21,523,0,524)T则初等反射矩阵H为,
H = I − 2 ω ω T = [ 0 3 5 0 4 5 3 5 16 25 0 − 12 25 0 0 1 0 4 5 − 12 25 0 9 25 ] H = I - 2\omega {\omega ^T} = \left[ {\begin{array}{l} 0&{\frac{3}{5}}&0&{\frac{4}{5}}\\\\ {\frac{3}{5}}&{\frac{{16}}{{25}}}&0&{ - \frac{{12}}{{25}}}\\\\ 0&0&1&0\\\\ {\frac{4}{5}}&{ - \frac{{12}}{{25}}}&0&{\frac{9}{{25}}} \end{array}} \right] H=I−2ωωT=⎣⎢⎢⎢⎢⎢⎢⎢⎢⎡0530545325160−2512001054−25120259⎦⎥⎥⎥⎥⎥⎥⎥⎥⎤
从而有,
η = H ξ = ( 5 , 0 , 0 , 0 ) T \eta = H\xi = {(5,0,0,0)^T} η=Hξ=(5,0,0,0)T
上例,是使用镜像变换完成了对向量的化简。那么,对于矩阵而言,同样把矩阵看成是一组列向量的组合,那么每次求解只完成一列列向量的简化。
而与Given变换不同的是,对于每一列列向量的简化,其对应的反射矩阵H只要求解一次,所以,经过 n − 1 n-1 n−1次左乘初等反射矩阵H就可以化成一个上三角阵,计算量大约是Givens变换的一半,即,
H ( n − 1 ) H ( n − 2 ) ⋯ H ( 1 ) A = A ( n ) = R Q = ( H ( n − 1 ) H ( n − 2 ) ⋯ H ( 1 ) ) − 1 \begin{array}{l} {H^{(n - 1)}}{H^{(n - 2)}} \cdots {H^{(1)}}A = {A^{(n)}} = R\\\\ Q = {({H^{(n - 1)}}{H^{(n - 2)}} \cdots {H^{(1)}})^{ - 1}} \end{array} H(n−1)H(n−2)⋯H(1)A=A(n)=RQ=(H(n−1)H(n−2)⋯H(1))−1
举个例子,用Housholder变换法对矩阵 A = [ 2 2 1 1 2 2 2 1 2 ] A = \left[ {\begin{array}{l} 2&2&1\\ 1&2&2\\ 2&1&2 \end{array}} \right] A=⎣⎡212221122⎦⎤进行QR分解。
第一步,完成对第一列的简化。
因为 α 1 ( 1 ) = ( 2 , 1 , 2 ) T ≠ 0 \alpha _1^{(1)} = {(2,1,2)^T} \ne 0 α1(1)=(2,1,2)T=0,作单位向量,
ω ( 1 ) = α 1 ( 1 ) − ∣ α 1 ( 1 ) ∣ e 1 ∣ α 1 ( 1 ) − ∣ α 1 ( 1 ) ∣ e 1 ∣ = ( 2 , 1 , 2 ) T − 3 e 1 ∣ ( 2 , 1 , 2 ) T − 3 e 1 ∣ = ( − 1 6 , 1 6 , 2 6 ) T {\omega ^{(1)}} = \frac{{\alpha _1^{(1)} - \left| {\alpha _1^{(1)}} \right|{e_1}}}{{\left| {\alpha _1^{(1)} - \left| {\alpha _1^{(1)}} \right|{e_1}} \right|}} = \frac{{{{(2,1,2)}^T} - 3{e_1}}}{{\left| {{{(2,1,2)}^T} - 3{e_1}} \right|}} = {( - \frac{1}{{\sqrt 6 }},\frac{1}{{\sqrt 6 }},\frac{2}{{\sqrt 6 }})^T} ω(1)=∣∣∣α1(1)−∣∣∣α1(1)∣∣∣e1∣∣∣α1(1)−∣∣∣α1(1)∣∣∣e1=∣∣∣(2,1,2)T−3e1∣∣∣(2,1,2)T−3e1=(−61,61,62)T则反射矩阵H为,
H ( 1 ) = I − 2 ω ( 1 ) ω ( 1 ) T = [ 2 3 1 3 2 3 1 3 2 3 − 2 3 2 3 − 2 3 − 1 3 ] {H^{(1)}} = I - 2{\omega ^{(1)}}{\omega ^{(1)T}} = \left[ {\begin{array}{l} {\frac{2}{3}}&{\frac{1}{3}}&{\frac{2}{3}}\\\\ {\frac{1}{3}}&{\frac{2}{3}}&{ - \frac{2}{3}}\\\\ {\frac{2}{3}}&{ - \frac{2}{3}}&{ - \frac{1}{3}} \end{array}} \right] H(1)=I−2ω(1)ω(1)T=⎣⎢⎢⎢⎢⎡3231323132−3232−32−31⎦⎥⎥⎥⎥⎤从而得,
H ( 1 ) A = [ 3 8 3 8 3 0 4 3 1 3 0 − 1 3 − 4 3 ] = [ 3 8 3 8 3 0 0 A 2 ] {H^{(1)}}A = \left[ {\begin{array}{l} 3&{\frac{8}{3}}&{\frac{8}{3}}\\\\ 0&{\frac{4}{3}}&{\frac{1}{3}}\\\\ 0&{ - \frac{1}{3}}&{ - \frac{4}{3}} \end{array}} \right] = \left[ {\begin{array}{l} 3&{\begin{array}{l} {\frac{8}{3}}&{\frac{8}{3}} \end{array}}\\\\ {\begin{array}{l} 0\\\\ 0 \end{array}}&{{A_2}} \end{array}} \right] H(1)A=⎣⎢⎢⎢⎢⎡3003834−313831−34⎦⎥⎥⎥⎥⎤=⎣⎢⎢⎢⎢⎡3003838A2⎦⎥⎥⎥⎥⎤
第二步,简化第二列。此时关注的不是矩阵 H ( 1 ) A H^{(1)}A H(1)A,而是 A 2 A_2 A2。
则,
ω ^ ( 2 ) = ( 4 3 , − 1 3 ) T − ∣ ( 4 3 , − 1 3 ) T ∣ e 1 ∣ ( 4 3 , − 1 3 ) T − ∣ ( 4 3 , − 1 3 ) T ∣ e 1 ∣ = ( 4 − 17 34 − 8 17 , − 1 34 − 8 17 ) T {\hat \omega ^{(2)}} = \frac{{{{(\frac{4}{3}, - \frac{1}{3})}^T} - \left| {{{(\frac{4}{3}, - \frac{1}{3})}^T}} \right|{e_1}}}{{\left| {{{(\frac{4}{3}, - \frac{1}{3})}^T} - \left| {{{(\frac{4}{3}, - \frac{1}{3})}^T}} \right|{e_1}} \right|}} = {\left( {\frac{{4 - \sqrt {17} }}{{\sqrt {34 - 8\sqrt {17} } }}, - \frac{1}{{\sqrt {34 - 8\sqrt {17} } }}} \right)^T} ω^(2)=∣∣∣(34,−31)T−∣∣∣(34,−31)T∣∣∣e1∣∣∣(34,−31)T−∣∣∣(34,−31)T∣∣∣e1=(34−8174−17,−34−8171)T
那么有,
H ^ ( 2 ) = I − 2 ω ^ ( 2 ) ω ^ ( 2 ) T = [ 1 − 2 ( 4 − 17 ) 2 34 − 8 17 2 ( 4 − 17 ) 34 − 8 17 2 ( 4 − 17 ) 34 − 8 17 1 − 2 34 − 8 17 ] {\hat H^{(2)}} = I - 2{\hat \omega ^{(2)}}{\hat \omega ^{(2)T}} = \left[ {\begin{array}{l} {1 - \frac{{2{{(4 - \sqrt {17} )}^2}}}{{34 - 8\sqrt {17} }}}&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}\\\\ {\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}&{1 - \frac{2}{{34 - 8\sqrt {17} }}} \end{array}} \right] H^(2)=I−2ω^(2)ω^(2)T=⎣⎢⎡1−34−8172(4−17)234−8172(4−17)34−8172(4−17)1−34−8172⎦⎥⎤从而对应的反射矩阵为,
H ( 2 ) = [ 1 0 0 H ^ ( 2 ) ] = [ 1 0 0 0 1 − 2 ( 4 − 17 ) 2 34 − 8 17 2 ( 4 − 17 ) 34 − 8 17 0 2 ( 4 − 17 ) 34 − 8 17 1 − 2 34 − 8 17 ] {H^{(2)}} = \left[ {\begin{array}{l} 1&0\\\\ 0&{{{\hat H}^{(2)}}} \end{array}} \right] = \left[ {\begin{array}{l} 1&0&0\\\\ 0&{1 - \frac{{2{{(4 - \sqrt {17} )}^2}}}{{34 - 8\sqrt {17} }}}&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}\\\\ 0&{\frac{{2(4 - \sqrt {17} )}}{{34 - 8\sqrt {17} }}}&{1 - \frac{2}{{34 - 8\sqrt {17} }}} \end{array}} \right] H(2)=⎣⎡100H^(2)⎦⎤=⎣⎢⎢⎢⎢⎢⎡10001−34−8172(4−17)234−8172(4−17)034−8172(4−17)1−34−8172⎦⎥⎥⎥⎥⎥⎤则有,
H ( 2 ) H ( 1 ) A = [ 3 8 3 8 3 0 17 3 8 17 51 0 0 5 17 17 ] = R {H^{(2)}}{H^{(1)}}A = \left[ {\begin{array}{l} 3&{\frac{8}{3}}&{\frac{8}{3}}\\\\ 0&{\frac{{\sqrt {17} }}{3}}&{\frac{{8\sqrt {17} }}{{51}}}\\\\ 0&0&{\frac{{5\sqrt {17} }}{{17}}} \end{array}} \right] = R H(2)H(1)A=⎣⎢⎢⎢⎢⎡300383170385181717517⎦⎥⎥⎥⎥⎤=R
Q的求解如下,
Q = ( H ( 2 ) H ( 1 ) ) − 1 = [ 2 3 2 17 51 − 3 17 17 1 3 10 17 51 2 17 17 2 3 − 7 17 17 2 17 17 ] Q = {({H^{(2)}}{H^{(1)}})^{ - 1}} = \left[ {\begin{array}{l} {\frac{2}{3}}&{\frac{{2\sqrt {17} }}{{51}}}&{ - \frac{{3\sqrt {17} }}{{17}}}\\\\ {\frac{1}{3}}&{\frac{{10\sqrt {17} }}{{51}}}&{\frac{{2\sqrt {17} }}{{17}}}\\\\ {\frac{2}{3}}&{ - \frac{{7\sqrt {17} }}{{17}}}&{\frac{{2\sqrt {17} }}{{17}}} \end{array}} \right] Q=(H(2)H(1))−1=⎣⎢⎢⎢⎢⎢⎡32313251217511017−17717−173171721717217⎦⎥⎥⎥⎥⎥⎤
附:
从实用性角度考虑,一般不用施密特正交化进行QR分解,用Givens变换法和Housholder变换法较多,但Givens变换法需要 n ( n − 1 ) 2 \frac{{n(n - 1)}}{2} 2n(n−1)次左乘初等旋转矩阵,计算量大,而Housholder变换法只需 n − 1 n-1 n−1次左乘初等反射矩阵,计算量小是Housholder变换法的优势。但在稀疏矩阵的QR分解上,用Givens方法求解仍有方便之处。