L1-PCA
Intro
The essence of PCA is projecting from a high-dimensional space onto a low-dimensional one. Projection, in turn, amounts to left- (or right-) multiplying by a vector whose entries weight the original features, so the linear combination maps the data into a low-dimensional representation. If the vector is replaced by a matrix (say, composed of m vectors), the original high-dimensional space is reduced to an m-dimensional one.
The problem L1-PCA addresses is outliers. Ordinary PCA assumes the noise is Gaussian; when outliers violate that assumption, its performance degrades badly. In short, ordinary PCA is sensitive to outliers, while L1-PCA is not.
PCA Review
It is worth understanding PCA from a mathematical point of view.
As stated above, PCA is essentially a transformation, and a linear transformation can be represented by a matrix product. Therefore:
\[ \text{Define a vector } a \in R^{p \times 1} \text{ and a feature matrix } X \in R^{n \times p}\text{; reducing } X \text{ to one dimension is then straightforward:} \\ X' = Xa, \quad X' \in R^{n \times 1} \]
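For instance, a minimal NumPy illustration of this projection (the sizes here are arbitrary, chosen only for the example):

import numpy as np

X = np.random.rand(5, 3)        # n = 5 samples, p = 3 features, one sample per row
a = np.random.rand(3, 1)
a /= np.linalg.norm(a)          # unit-length projection direction
X1 = X @ a                      # X' = Xa, shape (5, 1): the 1-D projection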
From signal processing we know that the signal typically has large variance while the noise variance is small, so it is natural to require that the projected data retain as much variance as possible. The next step is therefore variance maximization (where V = X^TX, which is proportional to the sample covariance matrix once the data are centered):
\[ \sigma^2_a = (Xa)^T(Xa) = a^TX^TXa = a^TVa \]
The task is now to find the a that maximizes this variance. Since scaling a would scale the variance arbitrarily, we impose the unit-norm constraint and form the Lagrangian:
\[ z = a^TVa - \lambda(a^Ta - 1), \quad a^Ta = 1 \\ \frac{\partial z}{\partial a} = (V+V^T)a - 2\lambda a = 2Va - 2\lambda a = 0 \quad (\text{since } V \text{ is symmetric}) \\ \Rightarrow (V - \lambda I)a = 0 \]
Clearly a is an eigenvector of V, so an eigendecomposition of V yields these vectors a.
One issue remains: to retain as much information (i.e., variance) as possible, compute the variance of each component, which equals its eigenvalue, sort them in descending order, and keep as many leading components as needed.
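Summarizing the derivation, here is a minimal NumPy sketch; the function name and the explicit centering step are my own choices for illustration (the post's full implementation appears in the Coding section below):

import numpy as np

def pca_project(X, out_n=2):
    """Project X (n samples x p features) onto its top out_n principal components."""
    Xc = X - X.mean(axis=0)                 # center each feature
    V = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance matrix, (p, p)
    eigvals, eigvecs = np.linalg.eigh(V)    # eigh: ascending eigenvalues, eigenvectors in columns
    A = eigvecs[:, ::-1][:, :out_n]         # the out_n directions with largest variance, (p, out_n)
    return Xc @ A                           # projected data, (n, out_n)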
L1-PCA
The idea of L1-PCA is that, after the dimension-reducing transformation, the L1 norm of the projected matrix should be as large as possible. The L1 norm here is taken entrywise, i.e., the sum of the absolute values of all entries, consistent with the single-vector objective below.
Ordinary PCA also has an equivalent formulation (see the references for the proof):
\[ \max_{W^TW=I} ||W^TX||_2^2 \]
That is, it maximizes the L2 norm. L1-PCA simply replaces the L2 norm with the L1 norm. The two problems are not equivalent, but they give similar results in practice, and experiments show the L1 version is more robust to outliers.
The problem L1-PCA solves is therefore (in this section X \in R^{p \times n} stacks the samples x_i as columns, unlike in the review above):
\[ \max_{W^TW=I} ||W^TX||_1 \]
Since solving this problem directly is difficult, it is usually solved greedily: first find one projection vector w, then the second, and so on, one vector at a time rather than the whole matrix at once. For a single projection vector w the problem becomes:
\[ w^* = \arg\max_w ||w^TX||_1 = \arg\max_w \sum_{i=1}^{n}|w^Tx_i|, \quad \text{subject to } ||w||_2=1 \]
The optimization iterates as follows (t is the iteration index):
\[ w(t+1) = \frac{\sum_{i=1}^{n}p_i(t)x_i}{||\sum_{i=1}^{n}p_i(t)x_i||_2}, \qquad p_i(t)=\begin{cases}1, & w^T(t)x_i \ge 0 \\ -1, & w^T(t)x_i < 0\end{cases} \]
After each component is extracted, the training data x are updated (deflated):
\[ x_i^m = x_i^{m-1} - w_{m-1}\left(w^T_{m-1}x_i^{m-1}\right), \quad i = 1,\dots,n \]
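Putting the fixed-point update and the deflation step together, a minimal sketch of the greedy procedure might look like this (the initialization, iteration count, and function name are illustrative choices, not part of the derivation):

import numpy as np

def l1_pca_greedy(X, out_n=2, n_iter=300):
    """Greedy L1-PCA: X is (p features x n samples); returns W of shape (p, out_n)."""
    p, n = X.shape
    Xm = X.copy()
    W = np.zeros((p, out_n))
    for m in range(out_n):
        w = np.ones(p) / np.sqrt(p)                  # arbitrary unit-norm start
        for _ in range(n_iter):                      # fixed-point iteration for one component
            s = np.where(Xm.T @ w >= 0, 1.0, -1.0)   # the signs p_i(t)
            w_new = Xm @ s                           # sum_i p_i(t) x_i
            w = w_new / np.linalg.norm(w_new)
        W[:, m] = w
        Xm = Xm - np.outer(w, w) @ Xm                # deflation: remove the found direction
    return W

On the iris data below, l1_pca_greedy(x.T, 2) should play the same role as the L1PCA class in the Coding section.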
For the detailed proof, see:
Derivation of the L1-PCA optimization
Coding
Apologies: I wrote the previous version of the code in the middle of the night and the final plot came out wrong; I noticed it this morning and have corrected it.
'''
@Descripttion: This is Aoru Xue's demo, which is only for reference.
@version:
@Author: Aoru Xue
@Date: 2019-12-12 22:57:47
@LastEditors: Aoru Xue
@LastEditTime: 2019-12-14 00:21:12
'''
import numpy as np
import copy
from matplotlib import pyplot as plt
from numpy import linalg
class L1PCA():
    def __init__(self):
        pass
    def __call__(self, x, out_n=2):  # x: (p, n), one sample per column
        w = np.ones(shape=(x.shape[0], out_n))  # projection vectors, (p, out_n)
        X = copy.copy(x)
        # first, solve for the first projection vector w via the fixed-point iteration
        for epoch in range(300):
            w_t = w[:, 0:1]  # (p, 1)
            top = np.zeros(shape=(X.shape[0], 1))  # (p, 1)
            for i in range(X.shape[1]):
                xi = X[:, i:i+1]  # (p, 1)
                if w_t.T.dot(xi) >= 0:  # p_i(t) = sign(w^T x_i)
                    pit = 1
                else:
                    pit = -1
                top += pit * xi
            bottom = np.sqrt(np.sum(top ** 2))
            w[:, 0:1] = top / bottom
        for j in range(1, out_n):
            # deflate: remove the component already explained by w_{j-1}
            for i in range(X.shape[1]):
                b = w[:, j-1:j] * (w[:, j-1:j].T.dot(X[:, i:i+1]))
                X[:, i:i+1] = X[:, i:i+1] - b
            # then run the same fixed-point iteration for the j-th vector
            for epoch in range(300):
                w_t = w[:, j:j+1]  # (p, 1)
                top = np.zeros(shape=(X.shape[0], 1))  # (p, 1)
                for i in range(X.shape[1]):
                    xi = X[:, i:i+1]  # (p, 1)
                    if w_t.T.dot(xi) >= 0:
                        pit = 1
                    else:
                        pit = -1
                    top += pit * xi
                bottom = np.sqrt(np.sum(top ** 2))
                w[:, j:j+1] = top / bottom
        return w.T.dot(x)  # project the original data, (out_n, n)
class PCA():
    def __init__(self):
        pass
    def __call__(self, x, out_n=2):  # x: (n, p), one sample per row
        Ex = np.mean(x, axis=0).reshape(-1, 1)  # feature means, (p, 1)
        Rx = np.cov(x.T)  # covariance matrix, (p, p)
        eigs, D = linalg.eig(Rx)  # eigenvalues (p,), eigenvectors as columns of D (p, p)
        indices = np.argsort(eigs)
        U = D[:, indices[:-out_n-1:-1]].T  # top out_n eigenvectors as rows, (out_n, p)
        Y = U.dot(x.T - Ex)  # Hotelling transform, (out_n, n)
        return Y
if __name__ == '__main__':
    dataset_path = "/home/xueaoru/下载/iris.data"
    dataset = np.loadtxt(dataset_path, dtype=str, delimiter=',')
    x = dataset[:, :-1].astype(float)  # (150, 4) numeric attributes
    y = dataset[:, -1]                 # class labels
    yy = list(set(y))
    plt.figure(figsize=(12, 8))
    plt.subplots_adjust(hspace=1)
    # scatter plots of the six raw attribute pairs
    for k, (i, j) in enumerate([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]):
        plt.subplot(4, 2, k + 1)
        plt.title("attr {} and attr {}".format(i, j))
        for c, target in zip("rgb", yy):  # y is the label
            plt.scatter(x[y == target, i], x[y == target, j], marker='x', c=c, label=target)
    plt.subplot(4, 2, 7)
    plt.title("Normal PCA")
    pca = PCA()
    x1 = pca(x).T  # back to samples as rows, (150, 2)
    for c, target in zip("rgb", yy):  # y is the label
        plt.scatter(x1[y == target, 0], x1[y == target, 1], marker='x', c=c, label=target)
    plt.subplot(4, 2, 8)
    plt.title("L1-PCA")
    l1pca = L1PCA()
    x2 = l1pca(x.T).T  # L1PCA expects samples as columns; transpose back to (150, 2)
    for c, target in zip("rgb", yy):  # y is the label
        plt.scatter(x2[y == target, 0], x2[y == target, 1], marker='x', c=c, label=target)
    plt.show()