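Experiment code: github repo
Shandong University machine learning course resource index
The implementation and efficiency comparison of the two methods, EVD-PCA and SVD-PCA, are covered in my earlier blog post "A PCA Speed-up Trick" (一个PCA加速技巧). This post adds the mathematical derivation of the SVD method.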
First, let the eigenvalue decomposition of a (real symmetric) square matrix $A$ be $A = U\Sigma U^T$.
For a general matrix $A$, we try to build an analogous decomposition
$$A = U\Sigma V^T$$
where $U^TU = I$ and $V^TV = I$.
Mimicking the eigenvalue decomposition, we form the square matrices $A^TA$ and $AA^T$:
$$A^TA = (U\Sigma V^T)^T U\Sigma V^T = V\Sigma^T U^TU\Sigma V^T = V\Sigma_1^2 V^T$$
where $\Sigma_1^2 = \Sigma^T\Sigma$ is diagonal. This is exactly an eigenvalue decomposition of the square matrix $A^TA$, so eigendecomposing $A^TA$ yields $V$.
Similarly, $AA^T = U\Sigma_2^2 U^T$ with $\Sigma_2^2 = \Sigma\Sigma^T$, so eigendecomposing $AA^T$ yields $U$.
Next we show that the non-zero entries of the diagonal matrices $\Sigma_1$ and $\Sigma_2$ are equal.
Consider the linear system $Ax = 0$. Left-multiplying by $A^T$ gives $A^TAx = 0$, so every solution of $Ax = 0$ is also a solution of $A^TAx = 0$.
Conversely, left-multiplying $A^TAx = 0$ by $x^T$ gives $x^TA^TAx = 0 \Longrightarrow (Ax)^T(Ax) = 0 \Longrightarrow Ax = 0$, so every solution of $A^TAx = 0$ is also a solution of $Ax = 0$.
Hence $Ax = 0$ and $A^TAx = 0$ have the same solution space, which gives $r(A) = r(A^TA)$. Applying the same argument to $A^T$ gives $r(A^T) = r(AA^T)$. Together with $r(A) = r(A^T)$, we get
$$r(AA^T) = r(A^T) = r(A) = r(A^TA)$$
That is, the matrices $A^TA$ and $AA^T$ have the same rank.
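As a quick sanity check of the rank argument, the sketch below uses a small hand-picked rank-1 matrix and confirms that $A$, $A^TA$ and $AA^T$ report the same rank and share a null vector. The matrix `A` and the vector `x` are illustrative choices, not data from the experiment.

```matlab
% Hand-picked 3x2 matrix of rank 1: the second column is twice the first.
A = [1 2; 2 4; 3 6];

rA   = rank(A);        % rank of A
rAtA = rank(A' * A);   % rank of the 2x2 matrix A^T A
rAAt = rank(A * A');   % rank of the 3x3 matrix A A^T

% x = [2; -1] solves A x = 0, so it must also solve A^T A x = 0.
x = [2; -1];
residA   = norm(A * x);
residAtA = norm(A' * A * x);

fprintf('r(A) = %d, r(A^T A) = %d, r(A A^T) = %d\n', rA, rAtA, rAAt);
fprintf('|Ax| = %g, |A^T A x| = %g\n', residA, residAtA);
```

Both residuals are exactly zero here because all entries are small integers; with real data a tolerance would be needed.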
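Let $x$ be an eigenvector of $A^TA$ with eigenvalue $\lambda$, so that $A^TAx = \lambda x$. Left-multiplying both sides by $A$ gives $AA^TAx = \lambda Ax$, i.e. $AA^T(Ax) = \lambda(Ax)$. If $\lambda \neq 0$ then $Ax \neq 0$ (otherwise $\lambda x = A^TAx = 0$), so $Ax$ is an eigenvector of $AA^T$ with the same eigenvalue $\lambda$. Swapping the roles of $A$ and $A^T$ gives the converse, so $A^TA$ and $AA^T$ have the same non-zero eigenvalues.
Moreover, these eigenvalues are the squares of the singular values: $\lambda_i = \sigma_i^2$.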
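The two claims above can be checked numerically. The following is a minimal sketch on a random matrix; the size, the seed and the tolerance are assumptions chosen only for illustration.

```matlab
rng(0);                                  % fixed seed so the run is repeatable
A = randn(5, 3);                         % hypothetical 5x3 test matrix

sigma = svd(A);                          % singular values, sorted descending

G1 = A' * A;  G1 = (G1 + G1') / 2;       % 3x3, symmetrized against round-off
G2 = A * A';  G2 = (G2 + G2') / 2;       % 5x5, symmetrized against round-off
lam1 = sort(eig(G1), 'descend');         % 3 eigenvalues
lam2 = sort(eig(G2), 'descend');         % 5 eigenvalues, the last 2 close to 0

tol = 1e-10 * lam1(1);
okSquares = max(abs(lam1 - sigma .^ 2)) < tol;   % lambda_i = sigma_i^2
okShared  = max(abs(lam2(1:3) - lam1)) < tol;    % same non-zero eigenvalues
okZeros   = max(abs(lam2(4:5))) < tol;           % extra eigenvalues vanish

disp([okSquares, okShared, okZeros]);
```

The larger matrix $AA^T$ only contributes extra zero eigenvalues, which is why decomposing the smaller of the two matrices loses nothing.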
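Building the decomposition of $AA^T$ from that of $A^TA$
The singular value decomposition of $A$ is $A = U\Sigma V^T$. Right-multiplying both sides by $V$ gives $AV = U\Sigma$, which reads column by column as $Av_i = \sigma_i u_i$, i.e., for $\sigma_i \neq 0$,
$$u_i = \frac{Av_i}{\sqrt{\lambda_i}}$$
where $v_i$ and $\lambda_i$ are obtained from the eigendecomposition of $A^TA$.
The converse works as well: from $A^TU = V\Sigma^T$ we get $v_i = A^Tu_i / \sqrt{\lambda_i}$, with $u_i$ and $\lambda_i$ obtained from $AA^T$.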
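The construction of $U$ from the eigenpairs of $A^TA$ can be sketched as follows. This is an illustrative example on a random matrix, not the code from the repository; the matrix size, the seed and the threshold used to discard zero eigenvalues are assumptions.

```matlab
rng(1);                                  % fixed seed so the run is repeatable
A = randn(6, 3);                         % hypothetical 6x3 matrix

% Eigendecomposition of the small 3x3 matrix A^T A.
G = A' * A;  G = (G + G') / 2;           % symmetrized against round-off
[V, D] = eig(G);
[lambda, idx] = sort(diag(D), 'descend');
V = V(:, idx);

% u_i is only defined for non-zero eigenvalues, so drop the rest.
keep = lambda > 1e-10 * lambda(1);
V = V(:, keep);
lambda = lambda(keep);

% Column i of U is A * v_i / sqrt(lambda_i).
U = (A * V) * diag(1 ./ sqrt(lambda));

orthErr  = norm(U' * U - eye(size(U, 2)));          % U has orthonormal columns
reconErr = norm(U * diag(sqrt(lambda)) * V' - A);   % U * Sigma * V^T rebuilds A
fprintf('orthErr = %g, reconErr = %g\n', orthErr, reconErr);
```

Only the small matrix is ever eigendecomposed; $U$ then costs one matrix product. The reconstruction check holds here because all three eigenvalues are kept.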
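function [corrRate, misclass] = ovoMultiClassModel(trainData, testData, classNum, K, trainNum)
    % One-vs-one multiclass SVM: trains one binary SVM per pair of classes
    % and classifies each test sample by majority vote.
    % Rows of trainData/testData are samples, grouped by class (class 1 first);
    % K is the feature dimension, trainNum the number of training samples per class.
    fprintf('Running one vs one multiclass model...\n');
    %% Step 1: Assign label
    testNum = 10 - trainNum; % assumes 10 samples per class in total
    N = classNum * (classNum - 1) / 2; % number of class pairs
    trainPart = cell(N, 1);
    % testPart = cell(N, 1);
    cnt = 1;
    map = zeros(N, 2); % map(cnt, :) = [i, j]: the two classes separated by the cnt-th binary classifier
    for i = 1 : (classNum - 1)
        for j = (i + 1) : classNum
            % class i gets label +1, class j gets label -1 (label stored in the last column)
            trainPart{cnt, 1} = [[trainData((i - 1) * trainNum + 1 : i * trainNum, :), ones(trainNum, 1)]; ...
                [trainData((j - 1) * trainNum + 1 : j * trainNum, :), -ones(trainNum, 1)]];
            % testPart{cnt, 1} = [[testData((i - 1) * testNum + 1 : i * testNum, :), ones(testNum, 1)]; ...
            %     [testData((j - 1) * testNum + 1 : j * testNum, :), -ones(testNum, 1)]];
            map(cnt, 1) = i; map(cnt, 2) = j;
            cnt = cnt + 1;
        end
    end
    % save('trainPart.mat', 'trainPart');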
    %% Step 2: Train SVM
    A = zeros(N, 2 * trainNum); % each binary problem has 2 * trainNum samples, hence 2 * trainNum alphas
    W = zeros(N, size(trainData, 2)); % N×K
    B = zeros(N, 1);
    for i = 1 : N
        % columns 1..K are the features, column K + 1 is the +1/-1 label
        SVMObj = mySVM(trainPart{i, 1}(:, (1 : K)), trainPart{i, 1}(:, K + 1), 1);
        A(i, :) = SVMObj.alpha';
        W(i, :) = SVMObj.w';
        B(i) = SVMObj.b;
    end
    %% Step 3: Test classifying
    % Decision values of the binary SVMs
    totalTest = size(testData, 1);
    val = zeros(totalTest, N);
    for i = 1 : N
        val(:, i) = testData * W(i, :)' + B(i);
        % (totalTest×1) = (totalTest×K) * (1×K)' + scalar
    end
    % Class i takes part in classNum - 1 of the N binary SVMs
    % (e.g. 39 of the 40×39/2 classifiers when classNum = 40)
    part = zeros(classNum, classNum - 1);
    for i = 1 : classNum
        part(i, :) = find(map(:, 1) == i | map(:, 2) == i);
    end
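    % f(x) = arg max_{s} ∑_{t} f_{s,t}(x), implemented below as a vote count
    %% A. One-vs-one SVM classifier result
    res = zeros(totalTest, 1);
    for i = 1 : totalTest % for every test sample
        voteCnt = zeros(classNum, 1);
        for j = 1 : classNum % for every candidate class
            for k = 1 : classNum - 1
                if val(i, part(j, k)) > 0 && map(part(j, k), 1) == j
                    % positive output votes for the first class of the pair
                    voteCnt(j) = voteCnt(j) + 1;
                elseif val(i, part(j, k)) < 0 && map(part(j, k), 2) == j
                    % negative output votes for the second class of the pair
                    voteCnt(j) = voteCnt(j) + 1;
                end
            end
        end
        [~, maxOne] = max(voteCnt); % ties go to the lowest class index
        res(i) = maxOne;
    end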
    %% B. Ground-truth labels
    truth = zeros(totalTest, 1); % named truth to avoid shadowing the built-in std
    for i = 1 : classNum
        truth((i - 1) * testNum + 1 : i * testNum) = i;
    end
    %% C. Compare and summarize
    corrRate = sum(res == truth) / totalTest;
    misclass = cell(1, 1);
    misclass{1} = [truth, res]; % column 1: true class, column 2: predicted class
    fprintf('one vs one multiclass done\n');
end
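The voting rule used in `ovoMultiClassModel` can be illustrated on a toy case. The sketch below is a compact variant of the same rule rather than the function's own loop; the three classes and the decision values in `val` are made up for illustration.

```matlab
classNum = 3;
map = [1 2; 1 3; 2 3];      % classifier k: map(k, 1) is the +1 class, map(k, 2) the -1 class

% Made-up decision values: one row per test sample, one column per classifier.
val = [ 0.8,  0.3, -0.5;
       -0.4,  0.6,  0.9;
       -0.2, -0.7, -0.1];

pos = val > 0;              % classifier k votes for map(k, 1)
neg = val < 0;              % classifier k votes for map(k, 2)

votes = zeros(size(val, 1), classNum);
for k = 1 : size(map, 1)
    votes(:, map(k, 1)) = votes(:, map(k, 1)) + pos(:, k);
    votes(:, map(k, 2)) = votes(:, map(k, 2)) + neg(:, k);
end

[~, pred] = max(votes, [], 2);   % ties go to the lowest class index
disp(votes);
disp(pred');
```

Looping over classifiers instead of over classes gives the same vote counts, and a decision value of exactly zero casts no vote in either version.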
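function SVMObj = mySVM(x, y, C)
    % Soft-margin linear SVM trained by solving the dual QP with quadprog.
    % x: m×K samples (one per row), y: m×1 labels in {+1, -1}, C: box constraint.
    m = size(x, 1);
    % Note: do not use length(x) here; length returns max(size(x)), the largest dimension
    options = optimset;
    options.largeScale = 'off';
    options.Display = 'off';
    %% A. Build the objective
    H = (y * y') .* (x * x'); % H(i, j) = y(i) * y(j) * x(i, :) * x(j, :)'
    f = (-1) * ones(m, 1);
    %% B. Build the constraints
    Aeq = y'; % equality constraint: sum_i alpha_i * y_i = 0
    beq = 0;
    lb = zeros(m, 1); % 0 <= alpha_i
    ub = C * ones(m, 1); % alpha_i <= C
    a0 = zeros(m, 1); % initial point
    %% C. Solve the dual problem with quadprog
    % quadprog(H,f,A,b,Aeq,beq,lb,ub,x0,options)
    alpha = quadprog(H, f, [], [], Aeq, beq, lb, ub, a0, options);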
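    %% D. Find the support vectors
    alpha(alpha < 1e-8) = 0; % treat tiny multipliers as exactly zero
    % w sums over all support vectors (alpha_i > 0), including those with alpha_i = C
    w = x' * (alpha .* y); % omega
    % b is estimated from the margin support vectors only (0 < alpha_i < C)
    sv = find(alpha > 0 & alpha < C);
    num = y - x * w;
    b = sum(num(sv)) / length(sv);
    %% Build the return object SVMObj
    SVMObj.alpha = alpha; % alpha(sv)
    SVMObj.w = w;
    SVMObj.b = b;
end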
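The way `mySVM` recovers `w` and `b` from the dual variables can be checked by hand on a tiny problem, without calling `quadprog`. In the sketch below the two training points and the value of `alpha` are assumptions: the data are made up, and `alpha` was worked out on paper for this specific case.

```matlab
% Two training points on the first axis, one per class.
x = [1 0; -1 0];          % rows are samples
y = [1; -1];              % labels
C = 1;                    % box constraint, as in the call mySVM(..., 1)

% Dual objective matrix, same definition as H in mySVM.
H = (y * y') .* (x * x'); % equals [1 1; 1 1] here

% The constraint y' * alpha = 0 forces alpha_1 = alpha_2 = a, and the dual
% objective 0.5 * alpha' * H * alpha - sum(alpha) = 2a^2 - 2a is minimized at a = 1/2.
alpha = [0.5; 0.5];

w = x' * (alpha .* y);             % w = sum_i alpha_i * y_i * x_i
sv = find(alpha > 0 & alpha < C);  % margin support vectors
b = mean(y(sv) - x(sv, :) * w);    % average of y_i - w' * x_i over them

margins = y .* (x * w + b);        % equals 1 on margin support vectors
disp(w');
disp(b);
disp(margins');
```

The separating hyperplane is the second axis, halfway between the two points, and both points sit exactly on the margin.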