AI Notes: Math Foundations, Orthogonal Matrices and the QR Decomposition of a Matrix

Orthogonal Matrices

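  • If an $n \times n$ square matrix $A$ satisfies $A^TA = E$, where $E$ is the identity matrix, then $A$ is called an orthogonal matrix (the complex analogue, with the conjugate transpose $A^H$ in place of $A^T$, is the unitary matrix).
    • $A$ is orthogonal if and only if its column (row) vectors are all unit vectors and are pairwise orthogonal.
  • If $A$ is an orthogonal matrix and $x$ is a vector, then the map $y = Ax$ is called an orthogonal transformation.
    • An orthogonal transformation does not change the length of a vector: with $y = Ax$, $y^Ty = (Ax)^TAx = x^TA^TAx = x^TEx = x^Tx$.
  • Properties of orthogonal matrices
    • If $A$ is an orthogonal matrix, then its inverse $A^{-1}$ is also orthogonal (indeed $A^{-1} = A^T$).
    • If $P$ and $Q$ are orthogonal matrices, then the product $PQ$ is also orthogonal, since $(PQ)^T(PQ) = Q^TP^TPQ = Q^TQ = E$.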
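The properties above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the rotation matrix `A`, the reflection `P`, the vector `x`, and the helper `is_orthogonal` are illustrative choices, not part of the original notes.

```python
import numpy as np

def is_orthogonal(M, tol=1e-12):
    """Return True if M is square and M^T M equals the identity up to tol."""
    M = np.asarray(M, dtype=float)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    return bool(np.allclose(M.T @ M, np.eye(M.shape[0]), atol=tol))

# Illustrative orthogonal matrix: a 2-D rotation by 30 degrees.
theta = np.pi / 6
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A second illustrative orthogonal matrix: reflection across the first axis.
P = np.array([[1.0,  0.0],
              [0.0, -1.0]])

x = np.array([3.0, 4.0])
y = A @ x  # orthogonal transformation of x

checks = {
    "A^T A = E": is_orthogonal(A),
    "length preserved": bool(np.isclose(np.linalg.norm(y), np.linalg.norm(x))),
    "inverse is orthogonal": is_orthogonal(np.linalg.inv(A)),
    "product is orthogonal": is_orthogonal(P @ A),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The comparisons use a tolerance because floating-point arithmetic rarely reproduces the identity matrix exactly.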

QR Decomposition (Orthogonal-Triangular Decomposition)

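  • For an $m \times n$ matrix $A$ with full column rank, there always exists a factorization $A_{m \times n} = Q_{m \times m} \cdot R_{m \times n}$.
  • Here $Q$ is an orthogonal matrix and $R$ is upper triangular with a nonsingular upper $n \times n$ block. If the diagonal entries of that block are required to be positive, the block and the first $n$ columns of $Q$ are uniquely determined.
  • This factorization is called the QR decomposition. It is commonly used to compute the eigenvalues of $A$, to compute the inverse of $A$, and to solve least-squares problems.
  • In short, the QR decomposition factors a matrix into the product of an orthogonal matrix and an upper triangular matrix.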
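As a rough illustration of the shapes involved and of the positive-diagonal convention, here is a sketch using `numpy.linalg.qr`. The matrix `A` is a made-up full-column-rank example, and the sign-normalization step is one common way to enforce a positive diagonal, not something prescribed by the notes.

```python
import numpy as np

# Made-up 4 x 3 matrix with full column rank, used only for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0]])
m, n = A.shape

# Full factorization: Q is m x m, R is m x n (rows below the n-th are zero).
Q_full, R_full = np.linalg.qr(A, mode="complete")

# Reduced ("thin") factorization: Q is m x n, R is n x n.
Q, R = np.linalg.qr(A, mode="reduced")

# NumPy does not promise a positive diagonal for R, so flip signs where needed.
# Scaling column j of Q and row j of R by the same sign leaves Q @ R unchanged.
signs = np.sign(np.diag(R))
signs[signs == 0] = 1.0          # cannot occur for full column rank; kept as a guard
Q_pos = Q * signs                # scales column j of Q by signs[j]
R_pos = signs[:, None] * R       # scales row j of R by signs[j]

print(Q_full.shape, R_full.shape)
print(Q.shape, R.shape)
print(np.diag(R_pos))
```

After the sign flip, `Q_pos` and `R_pos` form the unique thin factorization with a positive diagonal, which makes results comparable across libraries.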
[Figure 1: illustration of the QR decomposition; image not reproduced]

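  • Here $Q$ is an orthogonal matrix, $Q^TQ = E$ (the identity matrix), and $R$ is an upper triangular matrix.
  • In practice, the QR decomposition is often used to solve linear least-squares problems.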
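To make the least-squares use concrete: if $A = QR$ is the thin factorization, minimizing $\|Ac - y\|$ reduces to the triangular system $Rc = Q^Ty$. The sketch below assumes NumPy; the data points `t`, `y` and the straight-line model are invented for illustration.

```python
import numpy as np

# Invented data: fit y ~ c0 + c1 * t to four sample points.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.9, 5.1, 7.0])
A = np.column_stack([np.ones_like(t), t])   # 4 x 2 design matrix, full column rank

# Thin QR: Q (4 x 2) has orthonormal columns, R (2 x 2) is upper triangular.
Q, R = np.linalg.qr(A, mode="reduced")

# The least-squares solution satisfies R c = Q^T y.
# np.linalg.solve is a general solver; a dedicated triangular solver
# (back substitution) would exploit the structure of R.
c = np.linalg.solve(R, Q.T @ y)

# Cross-check against NumPy's own least-squares routine.
c_ref, *_ = np.linalg.lstsq(A, y, rcond=None)
print(c)
print(c_ref)
```

Working with $R$ instead of $A^TA$ avoids squaring the condition number, which is the usual argument for preferring QR over the normal equations.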

Gram-Schmidt Orthogonalization Process

  • The process turns a linearly independent set of vectors into an orthonormal set, which in turn yields an orthogonal matrix.
  • Let $\alpha_1, \alpha_2, \ldots, \alpha_r$ be linearly independent, and write $[x, y]$ for the inner product. Set $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2 - \frac{[\beta_1, \alpha_2]}{[\beta_1, \beta_1]} \beta_1$, $\beta_3 = \alpha_3 - \frac{[\beta_1, \alpha_3]}{[\beta_1, \beta_1]} \beta_1 - \frac{[\beta_2, \alpha_3]}{[\beta_2, \beta_2]} \beta_2$, $\cdots$
  • In general, $\beta_r = \alpha_r - \frac{[\beta_1, \alpha_r]}{[\beta_1, \beta_1]} \beta_1 - \frac{[\beta_2, \alpha_r]}{[\beta_2, \beta_2]} \beta_2 - \cdots - \frac{[\beta_{r-1}, \alpha_r]}{[\beta_{r-1}, \beta_{r-1}]} \beta_{r-1}$
  • Then $\beta_1, \beta_2, \cdots, \beta_r$ are pairwise orthogonal and equivalent to $\alpha_1, \alpha_2, \cdots, \alpha_r$.
  • Normalizing each vector, $\eta_1 = \frac{\beta_1}{\|\beta_1\|}, \eta_2 = \frac{\beta_2}{\|\beta_2\|}, \cdots, \eta_r = \frac{\beta_r}{\|\beta_r\|}$ is an orthonormal (standard orthogonal) set equivalent to $\alpha_1, \alpha_2, \ldots, \alpha_r$.
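The recurrence for $\beta_k$ and the normalization to $\eta_k$ can be sketched in a few lines. This assumes NumPy and linearly independent input columns; the function name `gram_schmidt` and the sample vectors are hypothetical. It implements the classical variant exactly as written above, which is known to be less numerically stable than the modified variant or Householder-based QR.

```python
import numpy as np

def gram_schmidt(alphas):
    """Classical Gram-Schmidt on the columns of `alphas`.

    Assumes the columns are linearly independent.
    Returns (betas, etas): pairwise-orthogonal columns and their
    normalized (orthonormal) versions.
    """
    alphas = np.asarray(alphas, dtype=float)
    betas = np.zeros_like(alphas)
    for k in range(alphas.shape[1]):
        beta = alphas[:, k].copy()
        for j in range(k):
            b = betas[:, j]
            # Subtract the projection of alpha_k onto the earlier beta_j.
            beta -= (b @ alphas[:, k]) / (b @ b) * b
        betas[:, k] = beta
    etas = betas / np.linalg.norm(betas, axis=0)
    return betas, etas

# Hypothetical input: three linearly independent vectors in R^3, stored as columns.
alphas = np.array([[1.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
betas, etas = gram_schmidt(alphas)

# The Gram matrix of the eta vectors should be the identity (up to rounding).
print(np.round(etas.T @ etas, 12))
```

Since the input here is square, `etas` is itself an orthogonal matrix, which is the sense in which the process "yields an orthogonal matrix".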

Example 1

  • Find the QR (orthogonal-triangular) decomposition of the matrix $A = \left(\begin{array}{ccc} 1 & 1 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)$
  • Analysis
    • It is easy to check that $A \in C_3^{4 \times 3}$ (a $4 \times 3$ matrix of rank 3), i.e. $A$ has full column rank.
    • Apply Gram-Schmidt orthogonalization to the three column vectors of $A = [\alpha_1, \alpha_2, \alpha_3]$, first obtaining an orthogonal set and then an orthonormal set.
    • $\beta_1 = \alpha_1 = [1 \ \ 1 \ \ 0 \ \ 0]^T$
    • $\beta_2 = \alpha_2 - \frac{(\alpha_2, \beta_1)}{(\beta_1, \beta_1)} \beta_1 = \alpha_2 - \frac{1}{2} \beta_1 = [\frac{1}{2} \ \ -\frac{1}{2} \ \ 1 \ \ 0]^T$
    • $\beta_3 = \alpha_3 - \frac{(\alpha_3, \beta_1)}{(\beta_1, \beta_1)} \beta_1 - \frac{(\alpha_3, \beta_2)}{(\beta_2, \beta_2)} \beta_2 = \alpha_3 + \frac{1}{2} \beta_1 + \frac{1}{3} \beta_2 = [-\frac{1}{3} \ \ \frac{1}{3} \ \ \frac{1}{3} \ \ 1]^T$
    • Then normalize them to obtain an orthonormal set of vectors
      • $\eta_1 = \frac{1}{\|\beta_1\|} \beta_1 = [\frac{\sqrt{2}}{2} \ \ \frac{\sqrt{2}}{2} \ \ 0 \ \ 0]^T$
      • $\eta_2 = \frac{1}{\|\beta_2\|} \beta_2 = [\frac{\sqrt{6}}{6} \ \ -\frac{\sqrt{6}}{6} \ \ \frac{\sqrt{6}}{3} \ \ 0]^T$
      • $\eta_3 = \frac{1}{\|\beta_3\|} \beta_3 = [-\frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{2}]^T$
    • $\Rightarrow Q = (\eta_1, \eta_2, \eta_3) = \left[\begin{array}{ccc} \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6} & -\frac{\sqrt{3}}{6} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{3}}{6} \\ 0 & \frac{\sqrt{6}}{3} & \frac{\sqrt{3}}{6} \\ 0 & 0 & \frac{\sqrt{3}}{2} \end{array}\right]$
    • To obtain $R$, solve the expressions for $\beta_1, \beta_2, \beta_3$ above for $\alpha_1, \alpha_2, \alpha_3$:
      • $\alpha_1 = \beta_1$
      • $\alpha_2 = \frac{1}{2}\beta_1 + \beta_2$
      • $\alpha_3 = -\frac{1}{2}\beta_1 - \frac{1}{3}\beta_2 + \beta_3$
    • Next, write each $\beta_i$ in terms of its normalized vector $\eta_i$ and substitute
      • $\beta_1 = \|\beta_1\| \eta_1 = \sqrt{2}\, \eta_1$, $\beta_2 = \|\beta_2\| \eta_2 = \frac{\sqrt{6}}{2} \eta_2$, $\beta_3 = \|\beta_3\| \eta_3 = \frac{2\sqrt{3}}{3} \eta_3$
      • Substituting into $\alpha_1 = \beta_1$, $\alpha_2 = \frac{1}{2}\beta_1 + \beta_2$, $\alpha_3 = -\frac{1}{2}\beta_1 - \frac{1}{3}\beta_2 + \beta_3$ gives $\alpha_1 = \sqrt{2}\, \eta_1$, $\alpha_2 = \frac{\sqrt{2}}{2} \eta_1 + \frac{\sqrt{6}}{2} \eta_2$, $\alpha_3 = -\frac{\sqrt{2}}{2} \eta_1 - \frac{\sqrt{6}}{6} \eta_2 + \frac{2\sqrt{3}}{3} \eta_3$
      • The coefficient of $\eta_i$ in $\alpha_j$ is the entry $R_{ij}$, so $R = \left[\begin{array}{ccc} \sqrt{2} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & \frac{\sqrt{6}}{2} & -\frac{\sqrt{6}}{6} \\ 0 & 0 & \frac{2\sqrt{3}}{3} \end{array}\right]$
      • This gives the QR decomposition of $A$:
      • $A = (\alpha_1 \ \ \alpha_2 \ \ \alpha_3) = QR = \left[\begin{array}{ccc} \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6} & -\frac{\sqrt{3}}{6} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{3}}{6} \\ 0 & \frac{\sqrt{6}}{3} & \frac{\sqrt{3}}{6} \\ 0 & 0 & \frac{\sqrt{3}}{2} \end{array}\right] \left[\begin{array}{ccc} \sqrt{2} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & \frac{\sqrt{6}}{2} & -\frac{\sqrt{6}}{6} \\ 0 & 0 & \frac{2\sqrt{3}}{3} \end{array}\right]$
      • In short: $A_{4 \times 3} = QR = Q_{4 \times 3} R_{3 \times 3}$ (here $Q$ has orthonormal columns rather than being square).
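The hand computation in Example 1 can be checked numerically. The sketch below assumes NumPy. The entries of `Q` are the vectors $\eta_1, \eta_2, \eta_3$, and the entries of `R` are the coefficients in the expansion $\alpha_3 = \frac{2\sqrt{3}}{3}\eta_3 - \frac{\sqrt{6}}{6}\eta_2 - \frac{\sqrt{2}}{2}\eta_1$ and its companions, so $R_{23} = -\frac{\sqrt{6}}{6}$.

```python
import numpy as np

A = np.array([[1.0, 1.0, -1.0],
              [1.0, 0.0,  0.0],
              [0.0, 1.0,  0.0],
              [0.0, 0.0,  1.0]])

s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)

# Q and R as derived by hand in Example 1.
Q = np.array([[s2 / 2,  s6 / 6, -s3 / 6],
              [s2 / 2, -s6 / 6,  s3 / 6],
              [0.0,     s6 / 3,  s3 / 6],
              [0.0,     0.0,     s3 / 2]])
R = np.array([[s2,  s2 / 2, -s2 / 2],
              [0.0, s6 / 2, -s6 / 6],
              [0.0, 0.0,     2 * s3 / 3]])

print(np.allclose(Q.T @ Q, np.eye(3)))  # columns of Q are orthonormal
print(np.allclose(Q @ R, A))            # the product reproduces A

# Compare with NumPy after normalizing signs so that diag(R) > 0.
Qn, Rn = np.linalg.qr(A, mode="reduced")
signs = np.sign(np.diag(Rn))
print(np.allclose(Qn * signs, Q), np.allclose(signs[:, None] * Rn, R))
```

The sign normalization is needed because a library may return a factorization whose $R$ has negative diagonal entries; after flipping, it should agree with the hand-derived factors by uniqueness of the positive-diagonal form.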
