Let $A$ be an $n\times n$ real matrix. Then for any vector $x$,
$$\overline{Ax}=\bar{A}\bar{x}=A\bar{x}$$
Suppose $\lambda$ is an eigenvalue of $A$ and $x$ is an eigenvector corresponding to $\lambda$. Then $\bar{\lambda}$ is also an eigenvalue of $A$, with corresponding eigenvector $\bar{x}$, since
$$A\bar{x}=\overline{Ax}=\overline{\lambda x}=\bar{\lambda}\bar{x}$$
Therefore, when $A$ is an $n\times n$ real matrix, its complex eigenvalues occur in conjugate pairs.
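As a quick numerical check of this fact, here is a minimal NumPy sketch (the matrix below is an arbitrary example, not taken from the text):

```python
import numpy as np

# An arbitrary real matrix chosen for illustration.
A = np.array([[1.0, -2.0, 0.5],
              [3.0,  1.0, 1.0],
              [0.0,  2.0, 4.0]])

eigvals = np.linalg.eigvals(A)
print(eigvals)

# For every eigenvalue, its complex conjugate is also (numerically) an eigenvalue.
for lam in eigvals:
    assert np.min(np.abs(eigvals - np.conj(lam))) < 1e-9
```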
If $a$ and $b$ are real numbers, not both zero, the matrix below is called a rotation-scaling matrix:
$$A=\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{1}$$
Its eigenvalues are $\lambda=a\pm ib$.
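A one-line numerical check (a sketch; the values of $a$ and $b$ below are arbitrary):

```python
import numpy as np

a, b = 0.8, 0.6                      # arbitrary, not both zero
A = np.array([[a, -b],
              [b,  a]])              # rotation-scaling matrix as in (1)
print(np.linalg.eigvals(A))          # expected: a ± ib, i.e. 0.8 ± 0.6i
```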
First we fix the following notation:
$$\begin{aligned} \operatorname{Re}(a+bi) &= a \\ \operatorname{Im}(a+bi) &= b \\ \operatorname{Re}\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} &= \begin{pmatrix} x \\ z \end{pmatrix} \\ \operatorname{Im}\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} &= \begin{pmatrix} y \\ w \end{pmatrix} \end{aligned} \tag{3}$$
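In NumPy this notation corresponds to the `.real` and `.imag` attributes of a complex vector, e.g. (numbers arbitrary):

```python
import numpy as np

v = np.array([1 + 2j, 3 + 4j])   # the vector (x+yi, z+wi) with x=1, y=2, z=3, w=4
print(v.real)                    # Re(v) = [1. 3.]
print(v.imag)                    # Im(v) = [2. 4.]
```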
We first consider a $2\times 2$ real matrix $A$ with a complex eigenvalue $\lambda$ and a corresponding eigenvector $v$. In this case there is a very elegant result, $A=CBC^{-1}$, where
$$C=\begin{pmatrix} | & | \\ \operatorname{Re}(v) & \operatorname{Im}(v) \\ | & | \end{pmatrix} \quad \text{and} \quad B=\begin{pmatrix} \operatorname{Re}(\lambda) & \operatorname{Im}(\lambda) \\ -\operatorname{Im}(\lambda) & \operatorname{Re}(\lambda) \end{pmatrix} \tag{4}$$
and the matrix $B$ is a rotation-scaling matrix.
To prove that this decomposition of $A$ holds, we first show that $C$ is invertible, i.e. that $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent. Argue by contradiction: suppose $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly dependent. Then there exist real numbers $x,y$, not both zero, such that $x\operatorname{Re}(v)+y\operatorname{Im}(v)=0$, and so
$$\begin{aligned} (y+ix)v &= (y+ix)\bigl(\operatorname{Re}(v)+i\operatorname{Im}(v)\bigr) \\ &= y\operatorname{Re}(v)-x\operatorname{Im}(v)+\bigl(x\operatorname{Re}(v)+y\operatorname{Im}(v)\bigr)i \\ &= y\operatorname{Re}(v)-x\operatorname{Im}(v) \end{aligned} \tag{5}$$
Since $x$ and $y$ are not both zero, $(y+ix)v$ is a nonzero eigenvector for the eigenvalue $\lambda$, and by (5) it is a real vector. But an eigenvalue of a real matrix associated with a real eigenvector must itself be real, which contradicts the assumption that $\lambda$ is complex. Hence $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent.
Next, write the complex eigenvalue as $\lambda=a+bi$ and the corresponding eigenvector as $v=\begin{pmatrix} x+yi \\ z+wi \end{pmatrix}$. Then
$$\begin{aligned} Av=\lambda v &= (a+bi)\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} \\ &= \begin{pmatrix} (ax-by)+(ay+bx)i \\ (az-bw)+(aw+bz)i \end{pmatrix} \\ &= \begin{pmatrix} ax-by \\ az-bw \end{pmatrix}+i\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix} \end{aligned} \tag{6}$$
At the same time,
$$A\left(\begin{pmatrix} x \\ z \end{pmatrix}+i\begin{pmatrix} y \\ w \end{pmatrix}\right)=A\begin{pmatrix} x \\ z \end{pmatrix}+iA\begin{pmatrix} y \\ w \end{pmatrix}=A\operatorname{Re}(v)+iA\operatorname{Im}(v) \tag{7}$$
Comparing (6) and (7), we obtain
$$A\operatorname{Re}(v)=\begin{pmatrix} ax-by \\ az-bw \end{pmatrix} \quad A\operatorname{Im}(v)=\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix} \tag{8}$$
Next we compute $CBC^{-1}\operatorname{Re}(v)$ and $CBC^{-1}\operatorname{Im}(v)$. From (4) we immediately get $Ce_1=\operatorname{Re}(v)$ and $Ce_2=\operatorname{Im}(v)$ (multiplying by a standard basis vector picks out the corresponding column), and hence $C^{-1}\operatorname{Re}(v)=e_1$ and $C^{-1}\operatorname{Im}(v)=e_2$. Then
$$\begin{aligned} CBC^{-1}\operatorname{Re}(v) &= CBe_{1}=C\begin{pmatrix} a \\ -b \end{pmatrix}=a\operatorname{Re}(v)-b\operatorname{Im}(v) \\ &= a\begin{pmatrix} x \\ z \end{pmatrix}-b\begin{pmatrix} y \\ w \end{pmatrix}=\begin{pmatrix} ax-by \\ az-bw \end{pmatrix}=A\operatorname{Re}(v) \\ CBC^{-1}\operatorname{Im}(v) &= CBe_{2}=C\begin{pmatrix} b \\ a \end{pmatrix}=b\operatorname{Re}(v)+a\operatorname{Im}(v) \\ &= b\begin{pmatrix} x \\ z \end{pmatrix}+a\begin{pmatrix} y \\ w \end{pmatrix}=\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix}=A\operatorname{Im}(v) \end{aligned} \tag{9}$$
Since $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent, they form a basis of $\mathbb{R}^2$. Any vector $w$ can therefore be written as $w=c\operatorname{Re}(v)+d\operatorname{Im}(v)$, and
$$\begin{aligned} Aw &= A\bigl(c\operatorname{Re}(v)+d\operatorname{Im}(v)\bigr) \\ &= cA\operatorname{Re}(v)+dA\operatorname{Im}(v) \\ &= cCBC^{-1}\operatorname{Re}(v)+dCBC^{-1}\operatorname{Im}(v) \\ &= CBC^{-1}\bigl(c\operatorname{Re}(v)+d\operatorname{Im}(v)\bigr) \\ &= CBC^{-1}w \end{aligned} \tag{10}$$
Therefore $A=CBC^{-1}$.
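Here is a minimal NumPy sketch of this result: pick a $2\times 2$ real matrix with complex eigenvalues (the one below is an arbitrary example), build $C$ and $B$ from one eigenpair as in (4), and check that $CBC^{-1}$ reproduces $A$:

```python
import numpy as np

# Arbitrary 2x2 real matrix with complex eigenvalues (chosen for illustration).
A = np.array([[1.0, -2.0],
              [1.0,  3.0]])

eigvals, eigvecs = np.linalg.eig(A)
lam = eigvals[0]         # one complex eigenvalue lambda
v = eigvecs[:, 0]        # a corresponding eigenvector

# C has columns Re(v) and Im(v); B is the rotation-scaling matrix built from lambda, as in (4).
C = np.column_stack([v.real, v.imag])
B = np.array([[ lam.real, lam.imag],
              [-lam.imag, lam.real]])

print(np.allclose(C @ B @ np.linalg.inv(C), A))   # expected: True
```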
We can interpret this decomposition of $A$, with its rotation-scaling factor $B$, as follows: $A$ combines a rotation with a scaling, and the matrix $C$ provides a change of variables $x=Cu$. Applying $A$ amounts to first changing variables from $x$ to $u$, then rotating and scaling with $B$ in the basis formed by the columns of $C$ (the rotation traces out an ellipse), and finally changing variables back from $u$ to $x$. Note that the rotation happens in the basis given by $C$, i.e. along the basis formed by $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$.
An $n\times n$ matrix admits a decomposition analogous to the $2\times 2$ case above. Take the $3\times 3$ case as an example: suppose $A$ has a real eigenvalue $\lambda_2$ and a complex eigenvalue $\lambda_1$, so that $\overline{\lambda_1}$ is the other complex eigenvalue. Let $v_2$ be a real eigenvector for $\lambda_2$ and $v_1$ a complex eigenvector for $\lambda_1$. Then $A$ can be decomposed as $A=CBC^{-1}$ with
$$C=\begin{pmatrix} | & | & | \\ \operatorname{Re}(v_{1}) & \operatorname{Im}(v_{1}) & v_{2} \\ | & | & | \end{pmatrix} \quad B=\begin{pmatrix} \operatorname{Re}(\lambda_{1}) & \operatorname{Im}(\lambda_{1}) & 0 \\ -\operatorname{Im}(\lambda_{1}) & \operatorname{Re}(\lambda_{1}) & 0 \\ 0 & 0 & \lambda_{2} \end{pmatrix} \tag{11}$$
For such a matrix $A$, there is a plane in $\mathbb{R}^3$ on which $A$ acts by rotation and scaling; this plane is invariant under $A$.
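The same construction can be checked numerically in the $3\times 3$ case; below is a sketch with an arbitrary matrix that has one real eigenvalue and one conjugate pair:

```python
import numpy as np

# Arbitrary 3x3 real matrix with one real eigenvalue and a complex-conjugate pair.
A = np.array([[1.0, -2.0, 0.0],
              [2.0,  1.0, 0.0],
              [1.0,  1.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)

# Pick a complex eigenvalue lambda_1 (with eigenvector v_1) and the real eigenvalue lambda_2.
i1 = np.argmax(np.abs(eigvals.imag))       # index of a complex eigenvalue
i2 = np.argmin(np.abs(eigvals.imag))       # index of the real eigenvalue
lam1, v1 = eigvals[i1], eigvecs[:, i1]
lam2, v2 = eigvals[i2].real, eigvecs[:, i2].real

# C and B as in (11).
C = np.column_stack([v1.real, v1.imag, v2])
B = np.array([[ lam1.real, lam1.imag, 0.0 ],
              [-lam1.imag, lam1.real, 0.0 ],
              [ 0.0,       0.0,       lam2]])

print(np.allclose(C @ B @ np.linalg.inv(C), A))   # expected: True
```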
For example, consider
$$A=\begin{bmatrix} 0.8 & -0.6 & 0 \\ 0.6 & 0.8 & 0 \\ 0 & 0 & 1.07 \end{bmatrix}$$
This matrix $A$ has the same form as the matrix in (11). As shown in the figure below, any vector $w_0$ in the $x_1x_2$-plane (third coordinate zero) is rotated by $A$ to another position in that plane, while any vector $x_0$ off that plane has its third coordinate multiplied by 1.07. The figure shows the iterates of $w_0=(2,0,0)$ and $x_0=(2,0,1)$ under $A$: $w_0$ rotates within the $x_1x_2$-plane, while $x_0$ rotates and, with its third coordinate scaled by 1.07 at each step, spirals upward.
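A short NumPy sketch of these iterations (the plot itself is omitted; we simply print the first few iterates of $w_0$ and $x_0$):

```python
import numpy as np

A = np.array([[0.8, -0.6, 0.0 ],
              [0.6,  0.8, 0.0 ],
              [0.0,  0.0, 1.07]])

w = np.array([2.0, 0.0, 0.0])   # stays in the x1x2-plane and simply rotates
x = np.array([2.0, 0.0, 1.0])   # rotates while its third coordinate grows by a factor of 1.07

for k in range(5):
    w, x = A @ w, A @ x
    print(k + 1, np.round(w, 3), np.round(x, 3))
```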