Proof of the von Neumann Trace Inequality

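Contents

  • Statement of the von Neumann trace inequality
  • Proof of the von Neumann trace inequality
    • Cauchy interlacing theorem
    • Mathematical induction
    • Proof of the von Neumann trace inequality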

Statement of the von Neumann trace inequality

Let $\mathbf{A} \in \mathbb{S}_{++}^n$ and $\mathbf{B} \in \mathbb{S}_{++}^n$ be symmetric positive definite matrices whose eigenvalues, sorted in descending order, are:
$$\lambda_1(\mathbf{A}) \geq \lambda_2(\mathbf{A}) \geq \cdots \geq \lambda_n(\mathbf{A}) > 0, \quad \lambda_1(\mathbf{B}) \geq \lambda_2(\mathbf{B}) \geq \cdots \geq \lambda_n(\mathbf{B}) > 0$$
Then the following inequality holds:
$$\sum_{i=1}^n \lambda_i(\mathbf{A})\lambda_{n-i+1}(\mathbf{B}) \leq \mathrm{tr}(\mathbf{AB}) = \sum_{i=1}^n \lambda_i(\mathbf{A B}) \leq \sum_{i=1}^n \lambda_i(\mathbf{A})\lambda_{i}(\mathbf{B})$$
This is known as the von Neumann trace inequality.

This article mainly proves the left-hand inequality (the lower bound); the right-hand inequality is easier to see.
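
Before working through the proof, it can help to see the statement hold numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `random_spd` and `trace_bounds` are illustrative and not part of the original article.

```python
import numpy as np

def random_spd(n, rng):
    """Return a random symmetric positive definite n x n matrix."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

def trace_bounds(a, b):
    """Return (lower bound, tr(AB), upper bound) from the trace inequality."""
    # eigvalsh returns eigenvalues in ascending order; flip to descending.
    lam_a = np.linalg.eigvalsh(a)[::-1]
    lam_b = np.linalg.eigvalsh(b)[::-1]
    # Lower bound pairs lambda_i(A) with lambda_{n-i+1}(B).
    lower = float(np.sum(lam_a * lam_b[::-1]))
    # Upper bound pairs lambda_i(A) with lambda_i(B).
    upper = float(np.sum(lam_a * lam_b))
    return lower, float(np.trace(a @ b)), upper

rng = np.random.default_rng(0)
a, b = random_spd(5, rng), random_spd(5, rng)
lower, tr_ab, upper = trace_bounds(a, b)
print(lower, tr_ab, upper)
```

A random check like this does not prove anything, but it is a quick way to catch an indexing mistake in the pairing of eigenvalues.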

Proof of the von Neumann trace inequality

Cauchy interlacing theorem

The proof relies on an important auxiliary result, the Cauchy interlacing theorem: for $1 \leq i \leq n-1$,

$$\lambda_{i+1}(\mathbf{C}) \leq \lambda_i(\widetilde{\mathbf{C}}) \leq \lambda_i(\mathbf{C})$$
where $\widetilde{\mathbf{C}} \in \mathbb{S}_{++}^{n-1}$ is an $(n-1)$-dimensional principal submatrix of $\mathbf{C} \in \mathbb{S}_{++}^n$.

Proof:
There exists a matrix $\mathbf{P} \in \mathbb{R}^{n \times (n-1)}$ with orthonormal columns (the identity matrix with one column deleted) such that $\mathbf{P}^{\mathrm{T}} \mathbf{C P} = \widetilde{\mathbf{C}}$. For $i \leq n-1$, the Courant-Fischer theorem gives, where the subscript of a subspace denotes its dimension:
$$\begin{aligned} \lambda_i(\widetilde{\mathbf{C}}) &= \max_{\mathcal{S}_i \subseteq \mathbb{R}^{n-1}} \min_{\mathbf{x} \in \mathcal{S}_i, \|\mathbf{x}\|_2=1} \mathbf{x}^{\mathrm{T}} \widetilde{\mathbf{C}} \mathbf{x} = \max_{\mathcal{S}_i \subseteq \mathbb{R}^{n-1}} \min_{\mathbf{x} \in \mathcal{S}_i, \|\mathbf{x}\|_2=1} (\mathbf{P x})^{\mathrm{T}} \mathbf{C} (\mathbf{P x}) \\ &\leq \max_{\mathcal{P}_i \subseteq \mathbb{R}^n} \min_{\mathbf{y} \in \mathcal{P}_i, \|\mathbf{y}\|_2=1} \mathbf{y}^{\mathrm{T}} \mathbf{C y} = \lambda_i(\mathbf{C}) \end{aligned}$$
The inequality holds because $\{\mathbf{P x} : \mathbf{x} \in \mathcal{S}_i\}$ is an $i$-dimensional subspace of $\mathbb{R}^n$ and $\|\mathbf{P x}\|_2 = \|\mathbf{x}\|_2$, so the maximum on the right is taken over a larger family of subspaces.

Similarly:
$$\begin{aligned} \lambda_i(\widetilde{\mathbf{C}}) &= \min_{\mathcal{S}_{n-i} \subseteq \mathbb{R}^{n-1}} \max_{\mathbf{x} \in \mathcal{S}_{n-i}, \|\mathbf{x}\|_2=1} \mathbf{x}^{\mathrm{T}} \widetilde{\mathbf{C}} \mathbf{x} = \min_{\mathcal{S}_{n-i} \subseteq \mathbb{R}^{n-1}} \max_{\mathbf{x} \in \mathcal{S}_{n-i}, \|\mathbf{x}\|_2=1} (\mathbf{P x})^{\mathrm{T}} \mathbf{C} (\mathbf{P x}) \\ &\geq \min_{\mathcal{P}_{n-i} \subseteq \mathbb{R}^n} \max_{\mathbf{y} \in \mathcal{P}_{n-i}, \|\mathbf{y}\|_2=1} \mathbf{y}^{\mathrm{T}} \mathbf{C y} = \lambda_{i+1}(\mathbf{C}) \end{aligned}$$

Combining the two bounds:
$$\lambda_{i+1}(\mathbf{C}) \leq \lambda_i(\widetilde{\mathbf{C}}) \leq \lambda_i(\mathbf{C})$$
This completes the proof.
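
The interlacing property can be checked numerically. The sketch below assumes NumPy and uses the leading principal submatrix, with $\mathbf{P}$ built as the first $n-1$ columns of the identity; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
m = rng.standard_normal((n, n))
c = m @ m.T + n * np.eye(n)            # symmetric positive definite C

# Selection matrix P: the first n-1 columns of the identity, so that
# P.T @ C @ P is the leading (n-1) x (n-1) principal submatrix of C.
p = np.eye(n)[:, : n - 1]
c_sub = p.T @ c @ p

lam = np.linalg.eigvalsh(c)[::-1]          # eigenvalues of C, descending
lam_sub = np.linalg.eigvalsh(c_sub)[::-1]  # eigenvalues of the submatrix, descending

# Check lambda_{i+1}(C) <= lambda_i(C~) <= lambda_i(C), up to rounding.
tol = 1e-10
interlaces = all(
    lam[i + 1] - tol <= lam_sub[i] <= lam[i] + tol for i in range(n - 1)
)
print(interlaces)
```

Deleting a different row and column pair gives another principal submatrix, and the same check applies with the corresponding column removed from the identity.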

Mathematical induction

Building on the Cauchy interlacing theorem above, we use mathematical induction to prove the following inequality, where $\boldsymbol{\Lambda}_{\mathbf{A}}$ is diagonal and $\widetilde{\mathbf{C}}$ is the $(n-1)$-dimensional principal submatrix of $\mathbf{C}$:

$$\sum_{i=1}^{n-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\right)\mathbf{C}_{i, i} \geq \sum_{i=1}^{n-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \lambda_{n-i}(\widetilde{\mathbf{C}})$$

Proof:

Define the diagonal matrix $\boldsymbol{\Lambda}'_{\mathbf{A}} \in \mathbb{S}_{++}^{k+1}$ and the matrix $\mathbf{C}' \in \mathbb{S}_{++}^{k+1}$, where the diagonal matrix $\boldsymbol{\Lambda}_{\mathbf{A}} \in \mathbb{S}_{++}^{k}$ and $\mathbf{C} \in \mathbb{S}_{++}^{k}$ are the $k$-dimensional principal submatrices of $\boldsymbol{\Lambda}'_{\mathbf{A}}$ and $\mathbf{C}'$ respectively. Hence for every $1 \leq i \leq k$ we have $\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})=\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})$ and $\mathbf{C}'_{i, i}=\mathbf{C}_{i, i}$.

For $n=2$ the claim holds trivially: $\widetilde{\mathbf{C}}$ is the $1 \times 1$ matrix $\mathbf{C}_{1,1}$, so both sides are equal.

Assume the claim holds for $n=k$:
$$\sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_k(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \mathbf{C}_{i, i} \geq \sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_k(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \lambda_{k-i}(\widetilde{\mathbf{C}})$$
Then for $n=k+1$:
$$\begin{aligned} & \sum_{i=1}^{k}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \mathbf{C}'_{i, i} \\ ={}& \sum_{i=1}^{k}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k}(\boldsymbol{\Lambda}'_{\mathbf{A}})+\lambda_{k}(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \mathbf{C}'_{i, i} \\ ={}& \sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \mathbf{C}'_{i, i}+\left(\lambda_k(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \sum_{i=1}^{k} \mathbf{C}'_{i, i} \\ ={}& \sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_{k}(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \mathbf{C}_{i, i}+\left(\lambda_k(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \sum_{i=1}^{k} \mathbf{C}'_{i, i} \\ \geq{}& \sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_k(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \lambda_{k-i}(\widetilde{\mathbf{C}})+\left(\lambda_k(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \sum_{i=1}^{k} \mathbf{C}'_{i, i} \\ \geq{}& \sum_{i=1}^{k-1}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \lambda_{k-i+1}(\mathbf{C})+\left(\lambda_k(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \sum_{i=1}^{k} \lambda_{k-i+1}(\mathbf{C}) \\ ={}& \sum_{i=1}^{k}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \lambda_{k-i+1}(\mathbf{C}) \end{aligned}$$

  • Note: the first inequality uses the induction hypothesis for $n=k$. The second uses the Cauchy interlacing theorem above, $\lambda_{k-i}(\widetilde{\mathbf{C}}) \geq \lambda_{k-i+1}(\mathbf{C})$, together with $\sum_{i=1}^{k} \mathbf{C}'_{i, i} = \operatorname{tr}(\mathbf{C}) = \sum_{i=1}^{k} \lambda_{k-i+1}(\mathbf{C})$.
    That is:
    $$\sum_{i=1}^{k}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \mathbf{C}'_{i, i} \geq \sum_{i=1}^{k}\left(\lambda_i(\boldsymbol{\Lambda}'_{\mathbf{A}})-\lambda_{k+1}(\boldsymbol{\Lambda}'_{\mathbf{A}})\right) \lambda_{k-i+1}(\mathbf{C})$$

This completes the proof.
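
The inequality proved by induction can also be checked numerically. This is a minimal sketch assuming NumPy; the function name `lemma_sides` and the random test data are illustrative choices, and $\widetilde{\mathbf{C}}$ is taken to be the leading principal submatrix.

```python
import numpy as np

def lemma_sides(lam_a, c):
    """Return (left, right) sides of the induction inequality.

    lam_a: 1-D array holding the diagonal of Lambda_A in descending order.
    c:     symmetric positive definite n x n matrix.
    """
    n = len(lam_a)
    weights = lam_a[: n - 1] - lam_a[n - 1]   # lambda_i - lambda_n, all >= 0
    c_sub = c[: n - 1, : n - 1]               # leading principal submatrix C~
    # eigvalsh sorts ascending, so entry i-1 equals lambda_{n-i}(C~)
    # in the descending convention used in the text.
    mu_ascending = np.linalg.eigvalsh(c_sub)
    left = float(np.sum(weights * np.diag(c)[: n - 1]))
    right = float(np.sum(weights * mu_ascending))
    return left, right

rng = np.random.default_rng(2)
n = 5
m = rng.standard_normal((n, n))
c = m @ m.T + n * np.eye(n)
lam_a = np.sort(rng.uniform(0.5, 3.0, size=n))[::-1]  # descending positive values
left, right = lemma_sides(lam_a, c)
print(left, right)
```

The key detail is the index reversal: the largest weight is paired with the smallest eigenvalue of the submatrix, which is what makes the right-hand side a lower bound.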

Proof of the von Neumann trace inequality

Take the eigendecompositions of $\mathbf{A}$ and $\mathbf{B}$: $\mathbf{A}=\mathbf{U} \boldsymbol{\Lambda}_{\mathbf{A}} \mathbf{U}^{\mathrm{T}}$ and $\mathbf{B}=\mathbf{V} \boldsymbol{\Lambda}_{\mathbf{B}} \mathbf{V}^{\mathrm{T}}$. Then:
$$\operatorname{tr}(\mathbf{A B})=\operatorname{tr}\left(\mathbf{U} \boldsymbol{\Lambda}_{\mathbf{A}} \mathbf{U}^{\mathrm{T}} \mathbf{V} \boldsymbol{\Lambda}_{\mathbf{B}} \mathbf{V}^{\mathrm{T}}\right)=\operatorname{tr}(\boldsymbol{\Lambda}_{\mathbf{A}} \underbrace{\mathbf{U}^{\mathrm{T}} \mathbf{V}}_{\mathbf{Q}} \boldsymbol{\Lambda}_{\mathbf{B}} \underbrace{\mathbf{V}^{\mathrm{T}} \mathbf{U}}_{\mathbf{Q}^{\mathrm{T}}})=\operatorname{tr}(\boldsymbol{\Lambda}_{\mathbf{A}} \underbrace{\mathbf{Q} \boldsymbol{\Lambda}_{\mathbf{B}} \mathbf{Q}^{\mathrm{T}}}_{\mathbf{C}})=\operatorname{tr}\left(\boldsymbol{\Lambda}_{\mathbf{A}} \mathbf{C}\right)=\sum_{i=1}^n \lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}}) \mathbf{C}_{i, i}=\sum_{i=1}^n \lambda_i(\mathbf{A}) \mathbf{C}_{i, i}$$

Note that $\boldsymbol{\Lambda}_{\mathbf{A}}$ has the same eigenvalues as $\mathbf{A}$ and, because $\mathbf{Q}$ is orthogonal, $\mathbf{C}$ has the same eigenvalues as $\mathbf{B}$.
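
The change of basis above can be reproduced in a few lines. The sketch below assumes NumPy and uses `numpy.linalg.eigh`, which returns eigenvalues in ascending order; the trace identity does not depend on the ordering as long as the eigenvalues and eigenvectors stay paired.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
ma, mb = rng.standard_normal((n, n)), rng.standard_normal((n, n))
a = ma @ ma.T + n * np.eye(n)   # symmetric positive definite A
b = mb @ mb.T + n * np.eye(n)   # symmetric positive definite B

lam_a, u = np.linalg.eigh(a)    # A = U diag(lam_a) U^T
lam_b, v = np.linalg.eigh(b)    # B = V diag(lam_b) V^T

q = u.T @ v                     # Q = U^T V is orthogonal
c = q @ np.diag(lam_b) @ q.T    # C = Q Lambda_B Q^T

trace_direct = np.trace(a @ b)
trace_weighted = np.sum(lam_a * np.diag(c))   # sum_i lambda_i(A) C_ii
same_spectrum = np.allclose(np.linalg.eigvalsh(c), lam_b)
print(np.isclose(trace_direct, trace_weighted), same_spectrum)
```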

$$\begin{aligned} \sum_{i=1}^n \lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}}) \mathbf{C}_{i, i} &= \sum_{i=1}^{n-1}\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}}) \mathbf{C}_{i, i}+\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}}) \left(\sum_{i=1}^n\mathbf{C}_{i, i}-\sum_{i=1}^{n-1}\mathbf{C}_{i, i}\right) \\ &= \sum_{i=1}^{n-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\right)\mathbf{C}_{i, i}+\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\sum_{i=1}^n\mathbf{C}_{i, i} \\ &\geq \sum_{i=1}^{n-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \lambda_{n-i}(\widetilde{\mathbf{C}})+\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\sum_{i=1}^n\mathbf{C}_{i, i} \\ &\geq \sum_{i=1}^{n-1}\left(\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}})-\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\right) \lambda_{n-i+1}(\mathbf{C})+\lambda_n(\boldsymbol{\Lambda}_{\mathbf{A}})\sum_{i=1}^n\lambda_{n-i+1}(\mathbf{C}) \\ &= \sum_{i=1}^{n}\lambda_i(\boldsymbol{\Lambda}_{\mathbf{A}}) \lambda_{n-i+1}(\mathbf{C})=\sum_{i=1}^n \lambda_i(\mathbf{A})\lambda_{n-i+1}(\mathbf{B}) \end{aligned}$$

That is: $\operatorname{tr}(\mathbf{A B})\geq\sum_{i=1}^n \lambda_i(\mathbf{A})\lambda_{n-i+1}(\mathbf{B})$

  • Note: the first inequality uses the inequality proved by mathematical induction; the second uses the Cauchy interlacing theorem above.
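
As a small sanity check, here is a worked $2 \times 2$ example. It is an illustration chosen for this purpose, not part of the original derivation. Take $\mathbf{A}=\operatorname{diag}(2,1)$ and $\mathbf{B}=\operatorname{diag}(1,3)$, so the descending eigenvalues are $\lambda(\mathbf{A})=(2,1)$ and $\lambda(\mathbf{B})=(3,1)$:

$$\begin{aligned} \operatorname{tr}(\mathbf{A B}) &= 2 \cdot 1 + 1 \cdot 3 = 5 \\ \sum_{i=1}^{2} \lambda_i(\mathbf{A})\lambda_{3-i}(\mathbf{B}) &= 2 \cdot 1 + 1 \cdot 3 = 5 \\ \sum_{i=1}^{2} \lambda_i(\mathbf{A})\lambda_{i}(\mathbf{B}) &= 2 \cdot 3 + 1 \cdot 1 = 7 \end{aligned}$$

So $5 \leq 5 \leq 7$, and the lower bound is attained here because the large eigenvalue of $\mathbf{A}$ sits in the same coordinate as the small eigenvalue of $\mathbf{B}$.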
