Coordinate Transformations (8): Complex Eigenvalues and Rotation

1. Conjugate complex eigenvalues

If $A$ is an $n\times n$ real matrix, then for any complex vector $x$,
$$\overline{Ax}=\bar{A}\bar{x}=A\bar{x}$$
Suppose $\lambda$ is an eigenvalue of $A$ and $x$ is a corresponding eigenvector. Then $\bar{\lambda}$ is also an eigenvalue of $A$, with $\bar{x}$ as a corresponding eigenvector:
$$A\bar{x}=\overline{Ax}=\overline{\lambda x}=\bar{\lambda}\bar{x}$$

Therefore, when $A$ is an $n\times n$ real matrix, its complex eigenvalues occur in conjugate pairs.
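The conjugate-pair property can be checked numerically. The following is a minimal sketch using only the Python standard library; the sample matrix and the helper names `eigenvalues_2x2` and `matvec` are assumptions made for illustration, not part of the original text.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (p, q), (r, s) = m
    trace = p + s
    det = p * s - q * r
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def matvec(m, v):
    """Matrix-vector product for lists of numbers (real or complex)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Hypothetical real matrix with non-real eigenvalues.
A = [[1.0, -2.0], [1.0, 3.0]]
lam1, lam2 = eigenvalues_2x2(A)

# For a 2x2 matrix [[p, q], [r, s]] with q != 0, (q, lam - p) is an
# eigenvector for the eigenvalue lam.
x = [complex(A[0][1]), lam1 - A[0][0]]
x_bar = [c.conjugate() for c in x]

# A x_bar should equal conj(lam1) * x_bar.
lhs = matvec(A, x_bar)
rhs = [lam1.conjugate() * c for c in x_bar]
print(lam1, lam2)
print(lhs, rhs)
```

The two eigenvalues are conjugates of each other, and conjugating the eigenvector of one gives an eigenvector of the other.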

2. Rotation-scaling matrix

Let $a$ and $b$ be real numbers, not both zero. The following matrix is called a rotation-scaling matrix:
$$A=\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{1}$$
It has the following properties:

  1. $A$ can be written as a rotation followed by a scaling:
    $$\begin{aligned} A&=\begin{bmatrix} a & -b \\ b & a \end{bmatrix}\\ &=\begin{bmatrix} r & 0 \\ 0 & r \end{bmatrix} \begin{bmatrix} \frac{a}{r} & \frac{-b}{r}\\ \frac{b}{r} & \frac{a}{r} \end{bmatrix} \\ &=\begin{bmatrix} r & 0 \\ 0 & r \end{bmatrix} \begin{bmatrix} \cos(\theta) & -\sin(\theta)\\ \sin(\theta) & \cos(\theta) \end{bmatrix} \end{aligned} \tag{2}$$
    where $r=\sqrt{\det(A)}=\sqrt{a^2+b^2}$ and $\theta$ is the angle with $\cos(\theta)=a/r$ and $\sin(\theta)=b/r$. So $A$ first rotates by $\theta$ and then scales by $r$.

  2. The eigenvalues of $A$ are $\lambda=a\pm ib$.
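As a quick illustration of both properties, the sketch below recovers $r$ and $\theta$ from $a$ and $b$, rebuilds the matrix from the rotation-plus-scaling form, and checks that $a+ib$ is a root of the characteristic polynomial. The sample values of `a` and `b` are assumptions chosen for illustration.

```python
import math

a, b = 1.0, 1.0  # hypothetical sample values, not both zero
A = [[a, -b], [b, a]]

r = math.hypot(a, b)        # scaling factor sqrt(a^2 + b^2) = sqrt(det A)
theta = math.atan2(b, a)    # angle with cos(theta) = a/r, sin(theta) = b/r

# Rebuild A as (scaling by r) times (rotation by theta).
rebuilt = [[r * math.cos(theta), -r * math.sin(theta)],
           [r * math.sin(theta),  r * math.cos(theta)]]

# Characteristic polynomial det(A - lam*I) = (a - lam)^2 + b^2,
# evaluated at lam = a + ib; it should vanish.
lam = complex(a, b)
char_poly_at_lam = (a - lam) ** 2 + b ** 2

print(r, theta)
print(rebuilt)
print(char_poly_at_lam)
```

Using `math.atan2(b, a)` rather than `math.atan(b / a)` keeps the angle in the correct quadrant and avoids a division by zero when `a` is 0.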

[Figure 1]

3. Complex eigenvalues of a matrix

We first fix the following notation:
$$\begin{aligned} \operatorname{Re}(a + bi) &= a \\ \operatorname{Im}(a + bi) &= b \\ \operatorname{Re}\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} &= \begin{pmatrix} x \\ z \end{pmatrix} \\ \operatorname{Im}\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} &= \begin{pmatrix} y \\ w \end{pmatrix} \end{aligned} \tag{3}$$

We start with a $2\times2$ real matrix $A$ that has a complex (non-real) eigenvalue $\lambda$ with a corresponding eigenvector $v$. In this case there is an elegant result: $A=CBC^{-1}$, where
$$C=\left(\begin{array}{cc} | & | \\ \operatorname{Re}(v) & \operatorname{Im}(v) \\ | & | \end{array}\right) \quad \text{and} \quad B=\left(\begin{array}{cc} \operatorname{Re}(\lambda) & \operatorname{Im}(\lambda) \\ -\operatorname{Im}(\lambda) & \operatorname{Re}(\lambda) \end{array}\right)\tag{4}$$
Here $B$ is a rotation-scaling matrix.

To prove that this decomposition of $A$ holds, we first show that $C$ is invertible, i.e. that $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent. We argue by contradiction. Suppose $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly dependent. Then there exist real numbers $x,y$, not both zero, such that $x\operatorname{Re}(v)+y\operatorname{Im}(v)=0$, and so
$$\begin{aligned} (y+ix)v &=(y+ix)(\operatorname{Re}(v)+i\operatorname{Im}(v)) \\ &=y\operatorname{Re}(v)-x\operatorname{Im}(v)+(x\operatorname{Re}(v)+y\operatorname{Im}(v))i \\ &=y\operatorname{Re}(v)-x\operatorname{Im}(v) \end{aligned}\tag{5}$$
Since $y+ix\neq 0$, the vector $(y+ix)v$ is still an eigenvector for the eigenvalue $\lambda$, and by (5) it is a real vector. But a real eigenvector of a real matrix must correspond to a real eigenvalue, which contradicts the assumption that $\lambda$ is a complex eigenvalue. Hence $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent.

Next, write the complex eigenvalue as $\lambda=a+bi$ and the corresponding eigenvector as $v=\begin{pmatrix} x+yi \\ z+wi \end{pmatrix}$. Then
$$\begin{aligned} Av=\lambda v &=(a+bi)\begin{pmatrix} x+yi \\ z+wi \end{pmatrix} \\ &=\begin{pmatrix} (ax-by)+(ay+bx)i \\ (az-bw)+(aw+bz)i \end{pmatrix} \\ &=\begin{pmatrix} ax-by \\ az-bw \end{pmatrix}+i\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix} \end{aligned}\tag{6}$$
At the same time,
$$A\left(\begin{pmatrix} x \\ z \end{pmatrix}+i\begin{pmatrix} y \\ w \end{pmatrix}\right)=A\begin{pmatrix} x \\ z \end{pmatrix}+iA\begin{pmatrix} y \\ w \end{pmatrix}=A\operatorname{Re}(v)+iA\operatorname{Im}(v)\tag{7}$$

Comparing the real and imaginary parts of (6) and (7) gives
$$A\operatorname{Re}(v)=\begin{pmatrix} ax-by \\ az-bw \end{pmatrix} \quad A\operatorname{Im}(v)=\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix}\tag{8}$$

Next we compute $CBC^{-1}\operatorname{Re}(v)$ and $CBC^{-1}\operatorname{Im}(v)$. From (4) we immediately get $Ce_1=\operatorname{Re}(v)$ and $Ce_2=\operatorname{Im}(v)$ (multiplying by a standard basis vector picks out the corresponding column), so $C^{-1}\operatorname{Re}(v)=e_1$ and $C^{-1}\operatorname{Im}(v)=e_2$. Therefore

$$\begin{aligned} CBC^{-1}\operatorname{Re}(v) &=CBe_{1}=C\begin{pmatrix} a \\ -b \end{pmatrix}=a\operatorname{Re}(v)-b\operatorname{Im}(v) \\ &=a\begin{pmatrix} x \\ z \end{pmatrix}-b\begin{pmatrix} y \\ w \end{pmatrix}=\begin{pmatrix} ax-by \\ az-bw \end{pmatrix}=A\operatorname{Re}(v) \\ CBC^{-1}\operatorname{Im}(v) &=CBe_{2}=C\begin{pmatrix} b \\ a \end{pmatrix}=b\operatorname{Re}(v)+a\operatorname{Im}(v) \\ &=b\begin{pmatrix} x \\ z \end{pmatrix}+a\begin{pmatrix} y \\ w \end{pmatrix}=\begin{pmatrix} ay+bx \\ aw+bz \end{pmatrix}=A\operatorname{Im}(v) \end{aligned}\tag{9}$$
Since $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ are linearly independent, they form a basis of $\mathbb{R}^2$. Any vector $w$ can therefore be written as $w=c\operatorname{Re}(v)+d\operatorname{Im}(v)$, and then
$$\begin{aligned} Aw &=A(c\operatorname{Re}(v)+d\operatorname{Im}(v)) \\ &=cA\operatorname{Re}(v)+dA\operatorname{Im}(v) \\ &=cCBC^{-1}\operatorname{Re}(v)+dCBC^{-1}\operatorname{Im}(v) \\ &=CBC^{-1}(c\operatorname{Re}(v)+d\operatorname{Im}(v)) \\ &=CBC^{-1}w \end{aligned}\tag{10}$$

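Therefore $A=CBC^{-1}$.
This decomposition of $A$ through a rotation-scaling matrix can be read as follows. $A$ contains a rotation and a scaling, and the matrix $C$ provides a change of variables, say $x=Cu$. The action of $A$ amounts to first changing variables from $x$ to $u$, then rotating and scaling with $B$ in the basis formed by the columns of $C$ (the rotation produces an ellipse), and finally changing variables from $u$ back to $x$. Note that the rotation takes place in the basis formed by $C$, that is, relative to the basis $\operatorname{Re}(v)$, $\operatorname{Im}(v)$.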
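The decomposition $A=CBC^{-1}$ can be checked on a concrete matrix. The sketch below is illustrative only: the sample matrix and the helpers `matmul2` and `inverse2` are assumptions, and the eigenvector formula used is valid only when the upper-right entry of the matrix is nonzero.

```python
import cmath

def matmul2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical real matrix with non-real eigenvalues.
A = [[1.0, -2.0], [1.0, 3.0]]

# One complex eigenvalue from the characteristic polynomial.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lam = (trace + cmath.sqrt(trace * trace - 4 * det)) / 2

# Eigenvector (q, lam - p) for [[p, q], [r, s]]; requires q != 0.
v = [complex(A[0][1]), lam - A[0][0]]

# C has columns Re(v) and Im(v); B is built from Re(lam) and Im(lam) as in (4).
C = [[v[0].real, v[0].imag],
     [v[1].real, v[1].imag]]
B = [[lam.real, lam.imag],
     [-lam.imag, lam.real]]

det_C = C[0][0] * C[1][1] - C[0][1] * C[1][0]   # nonzero: C is invertible
reconstructed = matmul2(matmul2(C, B), inverse2(C))
print(det_C)
print(reconstructed)
```

The nonzero `det_C` reflects the linear independence of $\operatorname{Re}(v)$ and $\operatorname{Im}(v)$ proved above, and `reconstructed` agrees with `A` up to rounding.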

Real $n\times n$ matrices admit decompositions analogous to the $2\times 2$ case above. Take the $3\times 3$ case as an example. If $A$ has one real eigenvalue $\lambda_{2}$ and one complex eigenvalue $\lambda_{1}$, then $\overline{\lambda_{1}}$ is the other complex eigenvalue. Let $v_2$ be a real eigenvector for $\lambda_{2}$ and $v_1$ a complex eigenvector for $\lambda_{1}$. Then $A$ decomposes as $A=CBC^{-1}$, with
$$C=\left(\begin{array}{ccc} | & | & | \\ \operatorname{Re}\left(v_{1}\right) & \operatorname{Im}\left(v_{1}\right) & v_{2} \\ | & | & | \end{array}\right) \quad B=\left(\begin{array}{ccc} \operatorname{Re}\left(\lambda_{1}\right) & \operatorname{Im}\left(\lambda_{1}\right) & 0 \\ -\operatorname{Im}\left(\lambda_{1}\right) & \operatorname{Re}\left(\lambda_{1}\right) & 0 \\ \hline 0 & 0 & \lambda_{2} \end{array}\right)\tag{11}$$
For such a matrix $A$, there is a plane in $\mathbb{R}^{3}$ on which $A$ acts by rotation and scaling, and this plane is invariant under $A$.
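The $3\times 3$ form in (11) can be sanity-checked without computing an inverse: since $C$ is invertible, $A=CBC^{-1}$ is equivalent to $AC=CB$. The sketch below uses a hypothetical block-diagonal matrix whose eigenpairs are known in advance; the matrix, the eigenpairs, and the helper `matmul` are all assumptions made for illustration.

```python
def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

# Hypothetical example: a 2x2 block with eigenvalues 2 +/- i,
# plus a real eigenvalue 0.5 along the third axis.
A = [[1.0, -2.0, 0.0],
     [1.0,  3.0, 0.0],
     [0.0,  0.0, 0.5]]

lam1 = complex(2.0, 1.0)                                   # complex eigenvalue
v1 = [complex(-2.0, 0.0), complex(1.0, 1.0), complex(0.0, 0.0)]
lam2 = 0.5                                                 # real eigenvalue
v2 = [0.0, 0.0, 1.0]

# Confirm that (lam1, v1) really is an eigenpair of A.
A_v1 = [sum(A[i][k] * v1[k] for k in range(3)) for i in range(3)]
lam1_v1 = [lam1 * c for c in v1]

# Columns of C: Re(v1), Im(v1), v2. B has the block form of (11).
C = [[v1[i].real, v1[i].imag, v2[i]] for i in range(3)]
B = [[lam1.real,  lam1.imag, 0.0],
     [-lam1.imag, lam1.real, 0.0],
     [0.0,        0.0,       lam2]]

AC = matmul(A, C)
CB = matmul(C, B)
print(AC)
print(CB)
```

Here the invariant plane is the one spanned by $\operatorname{Re}(v_1)$ and $\operatorname{Im}(v_1)$, which for this particular matrix is the $x_1x_2$ plane.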
For example, consider
$$A=\begin{bmatrix} 0.8 & -0.6 & 0 \\ 0.6 & 0.8 & 0 \\ 0 & 0 & 1.07 \end{bmatrix}$$
This matrix has the same form as the matrix $B$ in (11). As shown in the figure below, any vector $w_0$ in the $x_1x_2$ plane (third coordinate 0) is rotated by $A$ to another position in that plane, while any vector $x_0$ not in that plane has its third coordinate multiplied by 1.07. The figure shows the iterates of $w_0=(2,0,0)$ and $x_0=(2,0,1)$ under repeated application of $A$: $w_0$ rotates within the $x_1x_2$ plane, while $x_0$ spirals upward, rotating as its third coordinate is multiplied by 1.07 at each step.
[Figure 2: iterates of $w_0$ and $x_0$ under repeated application of $A$]
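The iteration described in this example can be reproduced with a few lines of standard-library Python. This is a sketch only: the helper names `matvec` and `iterate` and the choice of 8 steps are assumptions made for illustration.

```python
import math

A = [[0.8, -0.6, 0.0],
     [0.6,  0.8, 0.0],
     [0.0,  0.0, 1.07]]

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def iterate(m, start, steps):
    """Return [start, m*start, m^2*start, ...] with steps+1 points."""
    points = [list(start)]
    for _ in range(steps):
        points.append(matvec(m, points[-1]))
    return points

steps = 8  # hypothetical number of iterations
w_path = iterate(A, [2.0, 0.0, 0.0], steps)   # stays in the x1x2 plane
x_path = iterate(A, [2.0, 0.0, 1.0], steps)   # spirals upward

for k, (w, x) in enumerate(zip(w_path, x_path)):
    radius = math.hypot(x[0], x[1])           # distance from the x3 axis
    print(k, [round(c, 4) for c in w], [round(c, 4) for c in x], round(radius, 4))
```

Because $0.8^2+0.6^2=1$, the upper-left block is a pure rotation with $r=1$, so both paths keep a constant distance from the $x_3$ axis while the height of `x_path` grows by a factor of 1.07 per step.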
