Pseudoinverse Summary

References

方保镕, 周继东, 李医民. 矩阵论 (Matrix Theory), 2nd ed. [M]. 清华大学出版社 (Tsinghua University Press), 2013.
Convex Optimization [M]. 2013.

  1. Definition of the plus inverse
    Let $\textbf{A} \in \textbf{R}^{m \times n}$. If there exists a matrix $\textbf{G} \in \textbf{R}^{n \times m}$ satisfying:
    (1) $\textbf{AGA}=\textbf{A}$;
    (2) $\textbf{GAG}=\textbf{G}$;
    (3) $(\textbf{AG})^T=\textbf{AG}$;
    (4) $(\textbf{GA})^T=\textbf{GA}$;
    then $\textbf{G}$ is called the plus inverse of $\textbf{A}$, also known as the pseudoinverse or the Moore-Penrose inverse, and is written $\textbf{A}^+$.
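The four conditions can be checked mechanically for a candidate pair. The sketch below is only an illustration: the helper names `matmul`, `transpose` and `penrose_conditions`, and the example matrices, are assumptions chosen here and do not come from the references. Integer entries are used so that exact equality is meaningful; with floating-point data a tolerance would be needed instead.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def penrose_conditions(A, G):
    """Return the truth values of conditions (1)-(4) for the pair (A, G)."""
    AG, GA = matmul(A, G), matmul(G, A)
    return (
        matmul(AG, A) == A,    # (1) AGA = A
        matmul(GA, G) == G,    # (2) GAG = G
        transpose(AG) == AG,   # (3) (AG)^T = AG
        transpose(GA) == GA,   # (4) (GA)^T = GA
    )

A = [[1, 0], [0, 1], [0, 0]]       # 3 x 2 example matrix
G_good = [[1, 0, 0], [0, 1, 0]]    # 2 x 3, satisfies all four conditions
G_left = [[1, 0, 1], [0, 1, 0]]    # a left inverse of A that is not the pseudoinverse

print(penrose_conditions(A, G_good))
print(penrose_conditions(A, G_left))
```

In this example `G_left` satisfies $GA=I$ and therefore conditions (1), (2) and (4), but $AG$ is not symmetric, so condition (3) fails. This illustrates why the two symmetry conditions are needed to single out one matrix.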

  2. Computing the plus inverse
    The plus inverse of any matrix is unique. Suppose $A \in \textbf{R}^{m \times n}$ with $\textbf{rank }A=r$. Then $A$ can be factored as
    $$A=U\Sigma V^T,$$
    where $U\in \textbf{R}^{m\times r}$ satisfies $U^TU=I$, $V\in \textbf{R}^{n\times r}$ satisfies $V^TV=I$, and $\Sigma =\textbf{diag}(\sigma_1,\sigma_2,\cdots,\sigma_r)$ with
    $$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0.$$
    This factorization is the (compact) singular value decomposition of $A$. The columns of $U$ are called the left singular vectors of $A$, the columns of $V$ the right singular vectors, and the $\sigma_i$ the singular values. The singular value decomposition can also be written as
    $$A=\sum\limits_{i=1}^{r}\sigma_iu_iv_i^T,$$
    where $u_i\in\textbf{R}^m$ are the left singular vectors and $v_i\in\textbf{R}^n$ are the right singular vectors.

The pseudoinverse of $A$ is then
$$A^+=V\Sigma^{-1}U^T\in \textbf{R}^{n\times m}.$$
Substituting this expression into the four conditions of the definition shows that it satisfies each of them.
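The verification can be sketched as follows, writing $G=V\Sigma^{-1}U^T$ and using only $U^TU=I$, $V^TV=I$ and the invertibility of $\Sigma$ (the symbol $G$ is reused from the definition purely for illustration):
$$
\begin{aligned}
AG &= U\Sigma V^TV\Sigma^{-1}U^T = UU^T, &\qquad GA &= V\Sigma^{-1}U^TU\Sigma V^T = VV^T, \\
AGA &= UU^TU\Sigma V^T = U\Sigma V^T = A, &\qquad GAG &= VV^TV\Sigma^{-1}U^T = V\Sigma^{-1}U^T = G, \\
(AG)^T &= (UU^T)^T = UU^T = AG, &\qquad (GA)^T &= (VV^T)^T = VV^T = GA.
\end{aligned}
$$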
$U$ can be extended to an orthogonal (real unitary) matrix in $\textbf{R}^{m\times m}$ and $V$ to an orthogonal matrix in $\textbf{R}^{n\times n}$, so that the singular value decomposition of $A$ can be written as
$$A=U_{m\times m} \left[ \begin{matrix} \Sigma_{r\times r}&0\\ 0&0\\ \end{matrix} \right]_{m\times n} V_{n\times n}^T.$$
The pseudoinverse of $A$ then becomes
$$A^+=V_{n\times n} \left[ \begin{matrix} \Sigma_{r\times r}^{-1}&0\\ 0&0\\ \end{matrix} \right]_{n\times m} U_{m\times m}^T.$$
This form can likewise be checked against the four conditions of the definition.
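The SVD formula translates fairly directly into code. The following is a minimal sketch using NumPy, not a replacement for `np.linalg.pinv`: the function name `pinv_via_svd` and the cutoff parameter `rtol` are assumptions introduced here, since in floating point some threshold is needed to decide which singular values count as zero.

```python
import numpy as np

def pinv_via_svd(A, rtol=1e-12):
    """Pseudoinverse from the SVD, inverting only the nonzero singular values."""
    # Thin SVD: U is m x k, s has length k, Vt is k x n, with k = min(m, n).
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Assumed cutoff: singular values below rtol * (largest one) are treated as zero.
    tol = rtol * s.max() if s.size else 0.0
    r = int(np.sum(s > tol))  # numerical rank
    # A^+ = V_r diag(1/sigma_1, ..., 1/sigma_r) U_r^T
    return (Vt[:r].T / s[:r]) @ U[:, :r].T

# Rank-1 example: A = u v^T with u = (1, 2, 0) and v = (1, 2).
A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])
A_plus = pinv_via_svd(A)

print(A_plus.shape)
print(np.allclose(A_plus, np.linalg.pinv(A)))
```

Keeping only the first `r` columns of $U$ and $V$ corresponds to the compact form $A^+=V\Sigma^{-1}U^T$; the zero blocks of the full form contribute nothing to the product.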

  3. Equivalent expressions
    $$A^+ = \lim_{\epsilon \to 0}(A^TA+\epsilon I)^{-1}A^T= \lim_{\epsilon \to 0}A^T(AA^T+\epsilon I)^{-1}$$
    For $\epsilon>0$ the matrices $A^TA+\epsilon I$ and $AA^T+\epsilon I$ are positive definite, so both inverses exist.
    Proof:
    (1) Using the full singular value decomposition of $A$:
    $$
    \begin{aligned}
    \lim_{\epsilon \to 0}(A^TA+\epsilon I)^{-1}A^T
    &= \lim_{\epsilon \to 0}\left( V_{n\times n} \left[ \begin{matrix} \Sigma_{r\times r}^2&0\\ 0&0\\ \end{matrix} \right]_{n\times n} V_{n\times n}^T+\epsilon I\right)^{-1}A^T \\
    &= \lim_{\epsilon \to 0}\left( V \left[ \begin{matrix} \sigma_1^2+\epsilon&&&\\ &\ddots&&\\ &&\sigma_r^2+\epsilon&\\ &&&\epsilon I_{(n-r)\times (n-r)}\\ \end{matrix} \right]_{n\times n} V^T\right)^{-1}A^T \\
    &= \lim_{\epsilon \to 0} V \left[ \begin{matrix} \sigma_1^2+\epsilon&&&\\ &\ddots&&\\ &&\sigma_r^2+\epsilon&\\ &&&\epsilon I\\ \end{matrix} \right]_{n\times n}^{-1} V^TA^T \\
    &= \lim_{\epsilon \to 0} V \left[ \begin{matrix} \frac{1}{\sigma_1^2+\epsilon}&&&\\ &\ddots&&\\ &&\frac{1}{\sigma_r^2+\epsilon}&\\ &&&\frac{1}{\epsilon} I\\ \end{matrix} \right]_{n\times n} V^T V \left[ \begin{matrix} \Sigma_{r\times r}&0\\ 0&0\\ \end{matrix} \right]_{n\times m} U^T \\
    &= \lim_{\epsilon \to 0} V \left[ \begin{matrix} \frac{\sigma_1}{\sigma_1^2+\epsilon}&&&\\ &\ddots&&\\ &&\frac{\sigma_r}{\sigma_r^2+\epsilon}&\\ &&&0\\ \end{matrix} \right]_{n\times m} U^T \\
    &= V \left[ \begin{matrix} \frac{1}{\sigma_1}&&&\\ &\ddots&&\\ &&\frac{1}{\sigma_r}&\\ &&&0\\ \end{matrix} \right]_{n\times m} U^T \\
    &= V \left[ \begin{matrix} \Sigma_{r\times r}^{-1}&0\\ 0&0\\ \end{matrix} \right]_{n\times m} U^T = A^+.
    \end{aligned}
    $$
    The $\frac{1}{\epsilon}I$ block only ever multiplies the zero block of $A^T$, so it contributes nothing for any $\epsilon \ne 0$ and the limit exists. This proves the first equality.

(2) The second expression is handled in the same way, now diagonalizing $AA^T$ with $U$:
$$
\begin{aligned}
\lim_{\epsilon \to 0}A^T(AA^T+\epsilon I)^{-1}
&= \lim_{\epsilon \to 0}A^T\left( U_{m\times m} \left[ \begin{matrix} \Sigma_{r\times r}^2&0\\ 0&0\\ \end{matrix} \right]_{m\times m} U_{m\times m}^T+\epsilon I\right)^{-1} \\
&= \lim_{\epsilon \to 0}A^T\left( U_{m\times m} \left[ \begin{matrix} \sigma_1^2+\epsilon&&&\\ &\ddots&&\\ &&\sigma_r^2+\epsilon&\\ &&&\epsilon I_{(m-r)\times (m-r)}\\ \end{matrix} \right]_{m\times m} U_{m\times m}^T\right)^{-1} \\
&= \lim_{\epsilon \to 0}A^T U \left[ \begin{matrix} \sigma_1^2+\epsilon&&&\\ &\ddots&&\\ &&\sigma_r^2+\epsilon&\\ &&&\epsilon I\\ \end{matrix} \right]_{m\times m}^{-1} U^T \\
&= \lim_{\epsilon \to 0} V \left[ \begin{matrix} \Sigma_{r\times r}&0\\ 0&0\\ \end{matrix} \right]_{n\times m} U^T U \left[ \begin{matrix} \frac{1}{\sigma_1^2+\epsilon}&&&\\ &\ddots&&\\ &&\frac{1}{\sigma_r^2+\epsilon}&\\ &&&\frac{1}{\epsilon} I\\ \end{matrix} \right]_{m\times m} U^T \\
&= \lim_{\epsilon \to 0} V \left[ \begin{matrix} \frac{\sigma_1}{\sigma_1^2+\epsilon}&&&\\ &\ddots&&\\ &&\frac{\sigma_r}{\sigma_r^2+\epsilon}&\\ &&&0\\ \end{matrix} \right]_{n\times m} U^T \\
&= V \left[ \begin{matrix} \frac{1}{\sigma_1}&&&\\ &\ddots&&\\ &&\frac{1}{\sigma_r}&\\ &&&0\\ \end{matrix} \right]_{n\times m} U^T \\
&= V \left[ \begin{matrix} \Sigma_{r\times r}^{-1}&0\\ 0&0\\ \end{matrix} \right]_{n\times m} U^T = A^+.
\end{aligned}
$$
This proves the second equality.
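The two limit expressions can also be checked on a small example. The sketch below uses exact rational arithmetic from Python's `fractions` module so that no rounding error obscures the behaviour as $\epsilon$ shrinks. The helper names (`matmul`, `transpose`, `inv2`, `add_eps`, `left_form`, `right_form`, `max_abs_diff`) and the rank-1 test matrix are assumptions made for illustration, and `inv2` handles only the 2 x 2 case.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add_eps(M, eps):
    """Return M + eps * I for a square matrix M."""
    n = len(M)
    return [[M[i][j] + (eps if i == j else 0) for j in range(n)] for i in range(n)]

def max_abs_diff(X, Y):
    """Largest entrywise absolute difference between X and Y."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Rank-1 example A = u v^T with u = (1, 0), v = (1, 1), so A^+ = v u^T / 2.
A = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(0)]]
At = transpose(A)
A_plus = [[Fraction(1, 2), Fraction(0)], [Fraction(1, 2), Fraction(0)]]

def left_form(eps):
    """(A^T A + eps I)^{-1} A^T"""
    return matmul(inv2(add_eps(matmul(At, A), eps)), At)

def right_form(eps):
    """A^T (A A^T + eps I)^{-1}"""
    return matmul(At, inv2(add_eps(matmul(A, At), eps)))

for k in (1, 3, 6):
    eps = Fraction(1, 10**k)
    print(k, float(max_abs_diff(left_form(eps), A_plus)),
          float(max_abs_diff(right_form(eps), A_plus)))
```

For this matrix both expressions equal $\frac{1}{2+\epsilon}\left[\begin{matrix}1&0\\1&0\end{matrix}\right]$ exactly, so the distance to $A^+$ is $\frac{\epsilon}{2(2+\epsilon)}$ and shrinks in proportion to $\epsilon$, even though neither $A^TA$ nor $AA^T$ is invertible.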
