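Suppose two samples X and Y have means $\overline{X}$ and $\overline{Y}$. The sample covariance of X and Y is:
$$Cov(X,Y) = \frac{\sum_{i=1}^{n}(X_i-\overline{X})(Y_i-\overline{Y})}{n-1}$$
A positive covariance means X and Y are positively correlated, and a negative covariance means they are negatively correlated [1]. A covariance of 0 means X and Y are uncorrelated (no linear relationship), which is a weaker condition than independence.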
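As a quick check of the covariance formula, here is a minimal MATLAB sketch on made-up numbers (the vectors `X`, `Y`, `Xs`, `Ys` are illustrative, not data from this post). It compares the hand-written formula with the built-in `cov`, and shows a pair whose covariance is 0 even though one variable is a function of the other, so zero covariance only rules out a linear relationship.

```matlab
% Made-up sample data (illustrative only)
X = [2.1 2.5 3.6 4.0 5.2];
Y = [8.0 10.1 12.2 14.1 15.9];
n = numel(X);

% Sample covariance written out exactly as in the formula above
cov_manual = sum((X - mean(X)) .* (Y - mean(Y))) / (n - 1);

% Built-in: cov of a two-column matrix returns a 2x2 covariance matrix,
% whose off-diagonal entry is Cov(X,Y)
C = cov([X(:) Y(:)]);
cov_builtin = C(1, 2);

% Zero covariance without independence: Ys is fully determined by Xs
Xs = [-2 -1 0 1 2];
Ys = Xs .^ 2;
cov_zero = sum((Xs - mean(Xs)) .* (Ys - mean(Ys))) / (numel(Xs) - 1);

fprintf('manual = %.4f, built-in = %.4f, cov(Xs, Xs.^2) = %.4f\n', ...
        cov_manual, cov_builtin, cov_zero);
```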
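If $XW=\lambda W$ for a nonzero vector $W$, then $\lambda$ is called an eigenvalue of X and $W$ is the corresponding eigenvector. In other words, multiplying $W$ by X only scales $W$ by the factor $\lambda$. When X is an n×n real symmetric matrix (invertibility is not required), there exists an orthogonal matrix $Q$ ($Q^{-1}=Q^T$) such that:
$$Q^T X Q = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$
An eigendecomposition of X yields the eigenvalues and the eigenvectors (the columns of Q). For a symmetric positive semidefinite matrix such as a covariance matrix, the singular value decomposition gives the same result.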
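The diagonalization can be checked numerically. The sketch below uses a small made-up symmetric matrix (an assumption for illustration) and MATLAB's `eig`, then measures how far $Q$ is from orthogonal and how far $Q^TXQ$ is from the diagonal matrix of eigenvalues.

```matlab
% Small real symmetric matrix, made up for illustration
X = [4 1 0;
     1 3 1;
     0 1 2];

% Columns of Q are eigenvectors, diag(L) holds the eigenvalues
[Q, L] = eig(X);

orth_err = norm(Q' * Q - eye(3), 'fro');       % Q'Q should be the identity
diag_err = norm(Q' * X * Q - L, 'fro');        % Q'XQ should equal L
eig_err  = norm(X * Q(:, 1) - L(1, 1) * Q(:, 1));  % X*w = lambda*w for the first pair

fprintf('orth_err = %.2e, diag_err = %.2e, eig_err = %.2e\n', ...
        orth_err, diag_err, eig_err);
```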
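Reducing the dimensionality of data loses information, and we want that loss to be as small as possible. The criterion for the reduction is that the samples lie close enough to the hyperplane, or equivalently that their projections onto the hyperplane are spread out as much as possible [2]. For a detailed introduction, see here.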
Suppose the m data points $(x^{1},x^{2},...,x^{m})$, each n-dimensional, have already been centered, and that the coordinate system after the transformation is $\{w_1,w_2,...,w_n\}$, where the $w_j$ form an orthonormal basis: $||w_j||_2=1$ and $w_i^Tw_j=0$ for $i \neq j$. Discarding some of the dimensions leaves the new coordinate system $\{w_1,w_2,...,w_{n'}\}$, and the projection of sample $x^i$ onto this n'-dimensional coordinate system is: $Z^{i}=(z_1^i,z_2^i,...,z_{n'}^i)^T$
where $z_j^i = w_j^Tx^i$ is the coordinate of $x^i$ along the j-th dimension of the low-dimensional coordinate system. Using $Z^{i}$ to recover the original data $x^{i}$ gives:
$$\overline{x^i} = \sum_{j=1}^{n'}z_j^iw_j=WZ^i$$
where $W=(w_1,w_2,...,w_{n'})$ is the n×n' matrix whose columns are the retained basis vectors.
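The projection and reconstruction steps can be tried on a single vector. This is only a sketch under assumptions: the orthonormal basis comes from a QR factorization of a random matrix rather than from PCA, and the sizes `n` and `n_keep` are arbitrary.

```matlab
n = 5;          % original dimension (arbitrary)
n_keep = 2;     % retained dimension n' (arbitrary)

% Random orthonormal basis: the Q factor of a random square matrix
[Qf, ~] = qr(randn(n));
W = Qf(:, 1:n_keep);          % keep the first n' basis vectors as columns

x = randn(n, 1);              % one sample, as a column vector
z = W' * x;                   % low-dimensional coordinates, z_j = w_j' * x
x_rec = W * z;                % reconstruction, sum_j z_j * w_j

basis_err = norm(W' * W - eye(n_keep), 'fro');   % W'W should be the identity
resid_proj = norm(W' * (x - x_rec));             % residual is orthogonal to the kept basis
rec_err = norm(x - x_rec);                       % information lost by dropping dimensions

fprintf('basis_err = %.2e, resid_proj = %.2e, rec_err = %.4f\n', ...
        basis_err, resid_proj, rec_err);
```

The reconstruction error `rec_err` is generally nonzero because `n - n_keep` directions were discarded; PCA chooses `W` so that this error, summed over all samples, is as small as possible.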
Over the whole sample set, we want the samples to be close enough to the hyperplane, that is,
$$\min_W \sum_{i=1}^{m}||\overline{x^i}-x^i||_2^2$$
Expanding the objective, using $Z^i=W^Tx^i$ and $W^TW=I$:
$$\begin{aligned} \sum_{i=1}^{m}||\overline{x^i}-x^i||_2^2 &=\sum_{i=1}^{m}||WZ^i - x^i||_2^2 \\ &=\sum_{i=1}^{m}(WZ^i)^T(WZ^i)-2\sum_{i=1}^{m}(WZ^i)^Tx^i+\sum_{i=1}^{m}(x^i)^Tx^i \\ &=\sum_{i=1}^{m}(Z^i)^TZ^i-2\sum_{i=1}^{m}(Z^i)^TW^Tx^i+\sum_{i=1}^{m}(x^i)^Tx^i \\ &=\sum_{i=1}^{m}(Z^i)^TZ^i-2\sum_{i=1}^{m}(Z^i)^TZ^i+\sum_{i=1}^{m}(x^i)^Tx^i \\ &=-\sum_{i=1}^{m}(Z^i)^TZ^i+\sum_{i=1}^{m}(x^i)^Tx^i \\ &=-tr\left(W^T\left(\sum_{i=1}^{m}x^i(x^i)^T\right)W\right)+\sum_{i=1}^{m}(x^i)^Tx^i \\ &=-tr(W^TXX^TW)+\sum_{i=1}^{m}(x^i)^Tx^i \end{aligned}$$
where $X=(x^1,x^2,...,x^m)$ is the n×m matrix whose columns are the centered samples, so that $\sum_{i=1}^{m}x^i(x^i)^T=XX^T$.
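A sketch of the step that usually follows this expansion, given here as the standard argument rather than as part of the original text. It takes $X=(x^1,...,x^m)$ to be the n×m matrix of centered samples and introduces a symmetric matrix of Lagrange multipliers $\Lambda$ (a symbol assumed for illustration). Since $\sum_{i=1}^{m}(x^i)^Tx^i$ does not depend on $W$, minimizing the reconstruction error is the same as:

$$\begin{aligned} &\max_W \; tr(W^TXX^TW) \quad s.t. \; W^TW=I \\ &J(W) = tr(W^TXX^TW) - tr\left(\Lambda(W^TW-I)\right) \\ &\frac{\partial J}{\partial W} = 2XX^TW - 2W\Lambda = 0 \;\Rightarrow\; XX^TW = W\Lambda \end{aligned}$$

Taking $\Lambda$ diagonal, the columns of $W$ are eigenvectors of $XX^T$ and the objective equals the sum of the corresponding eigenvalues, so the error is smallest when $W$ holds the eigenvectors of the n' largest eigenvalues. Up to the factor $\frac{1}{m-1}$, $XX^T$ is the covariance matrix of the centered data, which is why PCA works with the eigenvectors of the covariance matrix.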
3. PCA in Face Recognition (MATLAB)
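clc
clear
% Read 40 * 9 images (40 subjects, 9 images each) to form the training set.
% Files are named ORL<2-digit subject><image number>.bmp, e.g. ORL001.bmp.
img_all = zeros(360,10304);
for i = 1:40
    for j = 1:9
        if (i <= 10)
            img_temp = imread(strcat('F:\Matlab\face_lib\ORL0',num2str(i-1),num2str(j),'.bmp'));
        else
            img_temp = imread(strcat('F:\Matlab\face_lib\ORL',num2str(i-1),num2str(j),'.bmp'));
        end
        % Flatten the 112x92 image (column-major order) into one row
        img_all((i-1)*9 + j,:) = img_temp(1:112*92);
    end
end
% Compute and display the sample mean (the "mean face")
img_mean = mean(img_all);
imshow(mat2gray(reshape(img_mean,112,92)));
% Center the samples
img_norm = zeros(360,10304);
for k = 1:360
    img_norm(k,:) = img_all(k,:) - img_mean;
end
% Use the small 360x360 matrix img_norm*img_norm' instead of the
% 10304x10304 matrix img_norm'*img_norm; both share the same nonzero eigenvalues
img_cov = img_norm * img_norm';
% Compute its eigenvectors and eigenvalues
[vec,val] = eig(img_cov);
val1 = diag(val);
% Sort ascending, then reorder so the largest eigenvalue comes first
[eigen_val_ascend,index] = sort(val1);
vsort = zeros(size(vec));
dsort = zeros(size(val1));
cols = size(vec,2);
for i = 1:cols
    vsort(:,i) = vec(:, index(cols-i+1));
    dsort(i) = val1(index(cols-i+1));
end
% Keep the smallest p whose eigenvalues account for at least 80% of the total
dsum = sum(dsort);
dsum_extract = 0; p = 0;
while (dsum_extract/dsum < 0.8)
    p = p + 1;
    dsum_extract = sum(dsort(1:p));
end
% Map each small eigenvector back to image space and scale it to unit length;
% the columns of featu_face are the eigenfaces (sized by p, not a fixed count)
featu_face = zeros(10304,p);
for i = 1:p
    featu_face(:,i) = dsort(i)^(-1/2) * img_norm' * vsort(:,i);
end
% Project the centered training samples onto the eigenfaces
samp_dist = img_norm * featu_face;
% Project the held-out test image the same way (centering shifts both
% sides equally, so the distances below are unchanged by it)
a = imread('F:\Matlab\face_lib\ORL020.bmp');
b = double(a(1:10304));
tar_dist = (b - img_mean) * featu_face;
% Euclidean distance from the test image to every training sample
tar_samp_dist = zeros(1,360);
for k = 1:360
    tar_samp_dist(k) = norm(tar_dist - samp_dist(k,:));
end
[dist,index1] = sort(tar_samp_dist);
% Row indices of the 5 nearest training samples;
% row k belongs to the subject with file number ceil(k/9)-1
index1(1:5)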
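The program takes eigenvectors of the small matrix `img_norm * img_norm'` and maps them back with `dsort(i)^(-1/2) * img_norm' * vsort(:,i)`. The sketch below checks that trick on synthetic data, so it runs without the face library. The matrix `A` stands in for `img_norm`, and the sizes `m`, `n`, `k` are arbitrary assumptions.

```matlab
% Synthetic stand-in for img_norm: m centered samples (rows), n features
m = 6; n = 50;
A = randn(m, n);
A = A - repmat(mean(A, 1), m, 1);     % center each column

% Small m-by-m matrix, symmetrized to guard against round-off
G = A * A';
G = (G + G') / 2;
[V, D] = eig(G);
[lam, order] = sort(diag(D), 'descend');
V = V(:, order);

% Map the leading small eigenvectors back to feature space,
% using the same 1/sqrt(lambda) scaling as featu_face
k = 3;
U = zeros(n, k);
for i = 1:k
    U(:, i) = A' * V(:, i) / sqrt(lam(i));
end

% They should be orthonormal eigenvectors of the large n-by-n matrix A'*A
C = A' * A;
resid = norm(C * U - U * diag(lam(1:k)), 'fro');
ortho = norm(U' * U - eye(k), 'fro');
fprintf('residual = %.2e, orthonormality error = %.2e\n', resid, ortho);
```

The reason this works: if $AA^Tv=\lambda v$ with $\lambda>0$, then $A^TA(A^Tv)=\lambda(A^Tv)$ and $||A^Tv||_2^2=\lambda$, so dividing by $\sqrt{\lambda}$ gives a unit eigenvector of $A^TA$. For 360 images of 10304 pixels, this replaces a 10304×10304 eigenproblem with a 360×360 one.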
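The program is adapted from [3].
[1] Principal Component Analysis (PCA), http://www.cnblogs.com/zhangchaoyang/articles/2222048.html
[2] Summary of the Principles of Principal Component Analysis (PCA), http://www.cnblogs.com/pinard/p/6239403.html
[3] https://blog.csdn.net/u013146742/article/details/52463848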