Main Properties of the Matrix Trace and Matrix Derivatives

Main formulas:

Trace:

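\begin{align}
\operatorname{tr} A &= \sum_{i=1}^{n} a_{ii} \tag{26} \\
\operatorname{tr} A &= \operatorname{tr} A^{T} \tag{27} \\
\operatorname{tr} aA &= a \operatorname{tr} A \tag{28} \\
\operatorname{tr}(A + B) &= \operatorname{tr} A + \operatorname{tr} B \tag{29} \\
\operatorname{tr} AB &= \operatorname{tr} BA \tag{30} \\
\operatorname{tr} ABC &= \operatorname{tr} BCA = \operatorname{tr} CAB \tag{31} \\
\operatorname{tr} ABCD &= \operatorname{tr} DABC = \operatorname{tr} CDAB = \operatorname{tr} BCDA \tag{32}
\end{align}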
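As a quick sanity check, the trace identities (27) and (30) to (32) can be tried on small integer matrices. The sketch below uses only the Python standard library; the helper names `matmul`, `transpose`, and `trace` and the sample matrices are illustrative choices, not part of the original formulas.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Arbitrary sample matrices (assumed values, chosen only for illustration).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, -1], [1, 3]]

tr_A = trace(A)
tr_AT = trace(transpose(A))                  # (27)

tr_AB = trace(matmul(A, B))                  # (30)
tr_BA = trace(matmul(B, A))

tr_ABC = trace(matmul(matmul(A, B), C))      # (31): cyclic shifts agree
tr_BCA = trace(matmul(matmul(B, C), A))
tr_CAB = trace(matmul(matmul(C, A), B))

# A non-cyclic reordering is not covered by (31) and differs for these matrices.
tr_ACB = trace(matmul(matmul(A, C), B))

print(tr_A == tr_AT, tr_AB == tr_BA, tr_ABC == tr_BCA == tr_CAB, tr_ACB == tr_ABC)
```

Only cyclic shifts of the product preserve the trace; swapping two factors, as in `tr_ACB`, generally changes it.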

Derivative:

\begin{align}
\frac{dX^{T}}{dX} &= \frac{dX}{dX^{T}} = E \tag{33} \\
\frac{d(AX)^{T}}{dX} &= A^{T} \tag{34} \\
\frac{d(AX)}{dX^{T}} &= A \tag{35} \\
\frac{d(X^{T}X)}{dX} &= 2X \tag{36} \\
\frac{d(X^{T}AX)}{dX} &= (A + A^{T})X \tag{37}
\end{align}
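Formula (37) can be checked numerically by comparing it with a central-difference gradient. The sketch below assumes $X$ is a column vector (stored as a plain list) and $A$ is a square matrix; the helper names `quad_form` and `numeric_grad`, the step size, and the sample values are illustrative assumptions.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def numeric_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

# Arbitrary non-symmetric sample matrix and vector (assumed values).
A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, 2.0]]
x = [1.0, -2.0, 0.5]

# Closed form from (37): (A + A^T) x
closed = [sum((A[i][j] + A[j][i]) * x[j] for j in range(3)) for i in range(3)]
numeric = numeric_grad(lambda v: quad_form(A, v), x)

max_err = max(abs(c - n) for c, n in zip(closed, numeric))
print(max_err < 1e-6)
```

Taking $A = E$ in the same check covers (36), since $X^{T}X = X^{T}EX$ and $(E + E^{T})X = 2X$.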

Gradient:

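Suppose we have a function $f: \mathbb{R}^{m \times n} \to \mathbb{R}$ that maps an $m \times n$ matrix to a real number. We define the derivative of $f$ with respect to the matrix $A$ as: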

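$$
\nabla_A f(A) =
\begin{bmatrix}
\dfrac{\partial f}{\partial a_{11}} & \cdots & \dfrac{\partial f}{\partial a_{1n}} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f}{\partial a_{m1}} & \cdots & \dfrac{\partial f}{\partial a_{mn}}
\end{bmatrix}
$$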

\begin{align}
\nabla_A \operatorname{tr} AB &= B^{T} \tag{38} \\
\nabla_A |A| &= |A| (A^{-1})^{T} \tag{39} \\
\nabla_{A^{T}} f(A) &= (\nabla_A f(A))^{T} \tag{40} \\
\nabla_A \operatorname{tr} ABA^{T}C &= CAB + C^{T}AB^{T} \tag{41}
\end{align}
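Formulas (38) and (39) can be checked the same way, by perturbing one entry $a_{ij}$ at a time. This is a minimal sketch for the $2 \times 2$ case using only the standard library; `numeric_matrix_grad`, `det2`, `max_abs_diff`, and the sample matrices are hypothetical names and values introduced for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def numeric_matrix_grad(f, A, h=1e-6):
    """Central-difference estimate of the matrix of partials df/da_ij."""
    rows, cols = len(A), len(A[0])
    G = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            A_plus = [row[:] for row in A]
            A_minus = [row[:] for row in A]
            A_plus[i][j] += h
            A_minus[i][j] -= h
            G[i][j] = (f(A_plus) - f(A_minus)) / (2 * h)
    return G

def max_abs_diff(X, Y):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(X[i][j] - Y[i][j])
               for i in range(len(X)) for j in range(len(X[0])))

# Arbitrary sample matrices (assumed values); A must be invertible for (39).
A = [[2.0, 1.0], [-1.0, 3.0]]
B = [[0.5, -2.0], [4.0, 1.0]]

# (38): the gradient of tr(AB) with respect to A should equal B^T.
grad_tr = numeric_matrix_grad(lambda M: trace(matmul(M, B)), A)
B_T = [[B[j][i] for j in range(2)] for i in range(2)]

# (39): the gradient of |A| should equal |A| (A^{-1})^T.
grad_det = numeric_matrix_grad(det2, A)
d = det2(A)
A_inv = [[A[1][1] / d, -A[0][1] / d],
         [-A[1][0] / d, A[0][0] / d]]
rhs_det = [[d * A_inv[j][i] for j in range(2)] for i in range(2)]

err_tr = max_abs_diff(grad_tr, B_T)
err_det = max_abs_diff(grad_det, rhs_det)
print(err_tr < 1e-6, err_det < 1e-6)
```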

Proofs of the formulas

(1) $\operatorname{tr} AB = \operatorname{tr} BA$

Let

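$$
A =
\begin{bmatrix}
a_{11} & \dots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \dots & a_{nn}
\end{bmatrix},
\qquad
B =
\begin{bmatrix}
b_{11} & \dots & b_{1n} \\
\vdots & \ddots & \vdots \\
b_{n1} & \dots & b_{nn}
\end{bmatrix}
$$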

Then

\begin{align}
\operatorname{tr} AB &= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji} \tag{96} \\
\operatorname{tr} BA &= \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ij} b_{ji} \tag{97}
\end{align}

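The two double sums contain the same terms, so $\operatorname{tr} AB = \operatorname{tr} BA$. The cyclic identities (31) and (32) follow in the same way, for example $\operatorname{tr} ABC = \operatorname{tr} (AB)C = \operatorname{tr} C(AB)$.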

(2) $\nabla_A \operatorname{tr} AB = B^{T}$

Since $\operatorname{tr} AB = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji}$, for every entry $a_{ij}$ of $A$ we have:

\begin{align}
\frac{\partial \operatorname{tr} AB}{\partial a_{ij}} &= b_{ji} \tag{98}
\end{align}

With $B$ as defined above, $b_{ji}$ is the $(i, j)$ entry of $B^{T}$, so $\nabla_A \operatorname{tr} AB = B^{T}$.

(3) $\nabla_A |A| = |A| (A^{-1})^{T}$

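By the cofactor expansion of the determinant along row $i$, $|A| = \sum_j a_{ij} A_{ij}$, where $A_{ij}$ is the cofactor of $A$ at position $(i, j)$. None of the cofactors in row $i$ depends on $a_{ij}$, so: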

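\begin{align}
\frac{\partial |A|}{\partial a_{ij}} &= A_{ij} \tag{99} \\
\implies \nabla_A |A| &= [A_{ij}] = (A^{*})^{T} = (|A| A^{-1})^{T} = |A| (A^{-1})^{T} \tag{100}
\end{align}

Here $[A_{ij}]$ is the matrix of cofactors and $A^{*}$ is the adjugate matrix, which satisfies $A^{*} = |A| A^{-1}$ when $A$ is invertible.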
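A worked $2 \times 2$ case may make the step from (99) to (100) more concrete. The entries $a, b, c, d$ below are illustrative symbols, and $A$ is assumed to be invertible:

```latex
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},
\qquad |A| = ad - bc,
\qquad
\nabla_A |A| =
\begin{bmatrix}
\dfrac{\partial |A|}{\partial a} & \dfrac{\partial |A|}{\partial b} \\
\dfrac{\partial |A|}{\partial c} & \dfrac{\partial |A|}{\partial d}
\end{bmatrix}
= \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}

A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\implies
|A| (A^{-1})^{T} = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}
```

Both sides give the same matrix, in agreement with (39).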

(4) $\nabla_{A^{T}} f(A) = (\nabla_A f(A))^{T}$

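On the left-hand side, the $(i, j)$ entry of $\nabla_{A^{T}} f(A)$ is $\dfrac{\partial f(A)}{\partial a_{ji}}$, because the $(i, j)$ entry of $A^{T}$ is $a_{ji}$.

On the right-hand side, the $(i, j)$ entry of $\nabla_A f(A)$, before transposition, is $\dfrac{\partial f(A)}{\partial a_{ij}}$.

The two matrices are therefore exactly transposes of each other.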

(5) $\nabla_A \operatorname{tr} ABA^{T}C = CAB + C^{T}AB^{T}$

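\begin{align}
\nabla_A \operatorname{tr} ABA^{T}C &= \nabla_A \operatorname{tr} A(BA^{T}C) + \nabla_A \operatorname{tr} (CAB)A^{T} \tag{101} \\
&= (BA^{T}C)^{T} + \nabla_A \operatorname{tr} [(CAB)A^{T}]^{T} \tag{102} \\
&= C^{T}AB^{T} + \nabla_A \operatorname{tr} A(CAB)^{T} \tag{103} \\
&= C^{T}AB^{T} + CAB \tag{104}
\end{align}

In (101) the product rule is applied: within each term the parenthesized factor is treated as a constant, and the second term uses the cyclic property (31). Step (102) uses (38) and (27), and step (104) applies (38) once more.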
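The result $CAB + C^{T}AB^{T}$ can also be compared against a central-difference gradient. The sketch below deliberately uses a non-square $A$ (here $2 \times 3$, with $B$ of size $3 \times 3$ and $C$ of size $2 \times 2$) to show that the shapes work out; all helper names and sample values are assumptions made for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Arbitrary sample matrices (assumed values).
A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]          # 2x3
B = [[2.0, 1.0, 0.0],
     [-1.0, 0.5, 3.0],
     [0.0, 4.0, 1.0]]           # 3x3
C = [[1.0, 2.0],
     [-3.0, 0.5]]               # 2x2

def f(M):
    """f(A) = tr(A B A^T C), with B and C held fixed."""
    return trace(matmul(matmul(matmul(M, B), transpose(M)), C))

# Closed form from (41): C A B + C^T A B^T
CAB = matmul(matmul(C, A), B)
CtABt = matmul(matmul(transpose(C), A), transpose(B))
closed = [[CAB[i][j] + CtABt[i][j] for j in range(3)] for i in range(2)]

# Central differences, one entry a_ij at a time.
h = 1e-6
numeric = [[0.0] * 3 for _ in range(2)]
for i in range(2):
    for j in range(3):
        A_plus = [row[:] for row in A]
        A_minus = [row[:] for row in A]
        A_plus[i][j] += h
        A_minus[i][j] -= h
        numeric[i][j] = (f(A_plus) - f(A_minus)) / (2 * h)

max_err = max(abs(closed[i][j] - numeric[i][j])
              for i in range(2) for j in range(3))
print(max_err < 1e-5)
```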
