Derivation of Eq. (10.24) in the Watermelon Book (西瓜书)

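Section 10.4 of the Watermelon Book, "Kernelized Linear Dimensionality Reduction", introduces a mapping function $\phi$ that maps each sample $x_i$ into a high-dimensional feature space, i.e. $z_i=\phi(x_i)$.
The preceding derivation in the book yields Eqs. (10.21) and (10.22):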
$$\left(\sum_{i=1}^m \phi(x_i)\phi(x_i)^T\right)w_j=\lambda_jw_j\tag{10.21}$$
where $w_j$ is one vector of an orthonormal basis of the high-dimensional feature space, and
$$w_j=\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{10.22}$$
where $\alpha_{i}^{j}=\frac{1}{\lambda_j}z_i^Tw_j$.
In general the explicit form of $\phi$ is unknown, so a kernel function is introduced:
$$\kappa(x_i,x_j)=\phi(x_i)^T\phi(x_j)\tag{10.23}$$
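To make Eq. (10.23) concrete, here is a minimal sketch for one case where $\phi$ happens to be known. It assumes the homogeneous degree-2 polynomial kernel on 2-D inputs, chosen only for illustration; the function names `phi` and `kappa` and the two sample points are hypothetical and do not come from the book.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel for a 2-D point."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def kappa(x, y):
    """Kernel value computed directly from the inputs, without forming phi."""
    return float(np.dot(x, y)) ** 2

# Two illustrative sample points.
x_i = np.array([1.0, 2.0])
x_j = np.array([3.0, -1.0])

explicit = float(phi(x_i) @ phi(x_j))   # inner product in feature space
implicit = kappa(x_i, x_j)              # same quantity via the kernel
print(explicit, implicit)
```

Both numbers agree up to floating-point rounding, which is exactly what Eq. (10.23) states: the kernel returns the feature-space inner product without evaluating $\phi$.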
Substituting Eqs. (10.22) and (10.23) into Eq. (10.21) gives
$$K\alpha^j=\lambda_j\alpha^j\tag{10.24}$$
where $K$ is the kernel matrix corresponding to $\kappa$, with $(K)_{ij}=\kappa(x_i,x_j)$ and $\alpha^j=(\alpha^j_1;\alpha^j_2;\ldots;\alpha^j_m)$.
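Eq. (10.24) can be checked numerically when the feature map is available explicitly. The following is a minimal sketch rather than part of the book's derivation: it assumes a hypothetical feature map `phi` (degree-2 monomials of a 2-D point) and random data, and the names `Phi`, `C`, and `alpha_j` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 2                      # m samples with n raw features (illustrative sizes)
X = rng.normal(size=(m, n))

def phi(x):
    """Hypothetical explicit feature map: degree-2 monomials of a 2-D point."""
    x1, x2 = x
    return np.array([x1, x2, x1 * x1, x2 * x2, x1 * x2])

Phi = np.stack([phi(x) for x in X], axis=1)   # d x m, columns are phi(x_i)
C = Phi @ Phi.T                               # sum_i phi(x_i) phi(x_i)^T, d x d
K = Phi.T @ Phi                               # (K)_ij = phi(x_i)^T phi(x_j), m x m

lam, W = np.linalg.eigh(C)                    # eigenpairs of (10.21), ascending order
lam_j, w_j = lam[-1], W[:, -1]                # pick the largest eigenvalue

alpha_j = Phi.T @ w_j / lam_j                 # alpha_i^j = z_i^T w_j / lambda_j

w_rebuilt = Phi @ alpha_j                     # right-hand side of (10.22)
lhs = K @ alpha_j                             # left-hand side of (10.24)
rhs = lam_j * alpha_j                         # right-hand side of (10.24)
print(np.allclose(w_rebuilt, w_j), np.allclose(lhs, rhs))
```

The first check confirms that `alpha_j` reproduces $w_j$ through Eq. (10.22); the second confirms that the same `alpha_j` is an eigenvector of $K$ with the same eigenvalue $\lambda_j$.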

We now derive Eq. (10.24):

$$\left(\sum_{i=1}^m \phi(x_i)\phi(x_i)^T\right)\left(\sum_{k=1}^m\phi(x_k)\alpha_k^j\right)=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{substitute (10.22) into (10.21)}$$
$$\sum_{k=1}^m\left(\sum_{i=1}^m \phi(x_i)\phi(x_i)^T\right)\phi(x_k)\alpha_k^j=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{distributive law}$$
$$\sum_{k=1}^m\left(\sum_{i=1}^m \phi(x_i)\phi(x_i)^T\phi(x_k)\alpha_k^j\right)=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{distributive law}$$
$$\sum_{k=1}^m\left(\sum_{i=1}^m \phi(x_i)\kappa(x_i,x_k)\alpha_k^j\right)=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{substitute (10.23)}$$
$$\sum_{i=1}^m \phi(x_i)\sum_{k=1}^m\kappa(x_i,x_k)\alpha_k^j=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{swap the order of summation}$$
$$\sum_{i=1}^m \phi(x_i)\left(K\alpha^j\right)_i=\lambda_j\sum_{i=1}^m\phi(x_i)\alpha_i^j\tag{matrix-vector product}$$
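$$\Phi\left(K\alpha^j\right)=\lambda_j\Phi\alpha^j\tag{matrix multiplication}$$
where $\Phi=(\phi(x_1),\phi(x_2),\ldots,\phi(x_m))$.
Because $\Phi\in\mathbb{R}^{d\times m}$ is generally not square, $\Phi^{-1}$ need not exist, so $\Phi$ cannot simply be cancelled by an inverse. If $\Phi$ has full column rank, left-multiplying both sides by $(\Phi^T\Phi)^{-1}\Phi^T$ removes it. In general, the coefficients defined under Eq. (10.22), written in matrix form as $\alpha^j=\frac{1}{\lambda_j}\Phi^Tw_j$, satisfy the result directly: using $K=\Phi^T\Phi$ and $\Phi\Phi^Tw_j=\lambda_jw_j$ from Eq. (10.21), we get $K\alpha^j=\frac{1}{\lambda_j}\Phi^T\Phi\Phi^Tw_j=\Phi^Tw_j=\lambda_j\alpha^j$. Hence
$$K\alpha^j=\lambda_j\alpha^j\tag{10.24}$$
This completes the derivation.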
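The step that removes $\Phi$ deserves care, because $\Phi$ is $d\times m$ and in general not square. The sketch below is an illustration under stated assumptions, not part of the book: a random `Phi` with $d<m$ stands in for the mapped samples, and `beta` is a hypothetical alternative coefficient vector. It suggests that the feature-space identity $\Phi K\beta=\lambda_j\Phi\beta$ can hold for a `beta` that is not an eigenvector of $K$, while the coefficients `alpha` defined under Eq. (10.22) are.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 2, 4                        # feature dimension d < sample count m (illustrative)
Phi = rng.normal(size=(d, m))      # columns stand in for phi(x_i)
K = Phi.T @ Phi                    # m x m kernel matrix

lam, W = np.linalg.eigh(Phi @ Phi.T)
lam_j, w_j = lam[-1], W[:, -1]     # largest eigenpair of (10.21)
alpha = Phi.T @ w_j / lam_j        # coefficients defined under (10.22)

# A unit vector in the null space of Phi: the last right-singular vector,
# which exists because Phi has rank at most d < m.
_, _, Vt = np.linalg.svd(Phi)
v = Vt[-1]
beta = alpha + v                   # Phi @ beta equals Phi @ alpha, so it gives the same w_j

holds_in_feature_space = np.allclose(Phi @ (K @ beta), lam_j * (Phi @ beta))
beta_is_eigvec = np.allclose(K @ beta, lam_j * beta)
alpha_is_eigvec = np.allclose(K @ alpha, lam_j * alpha)
print(holds_in_feature_space, beta_is_eigvec, alpha_is_eigvec)
```

In this setup `beta` satisfies the equation with $\Phi$ on both sides but not Eq. (10.24), whereas `alpha` satisfies both. Cancelling $\Phi$ is therefore only safe when $\Phi$ has full column rank or when $\alpha^j$ is taken to be the specific coefficients from Eq. (10.22).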

Finally, to aid understanding, the dimensions of the quantities above are listed below, where $d$ denotes the dimension of the feature space:
$$\begin{aligned} \alpha^j_i &\in \mathbb{R}^{1\times1} \\ \alpha^j &\in \mathbb{R}^{m \times 1} \\ K &\in \mathbb{R}^{m \times m} \\ K\alpha^j &\in \mathbb{R}^{m \times 1} \\ \left(K\alpha^j\right)_i &\in \mathbb{R}^{1 \times 1} \\ \phi(x_i) &\in \mathbb{R}^{d \times 1} \\ \Phi &\in \mathbb{R}^{d \times m} \\ \Phi \left( K \alpha^j \right) &\in \mathbb{R}^{d \times 1} \end{aligned}$$
