Overview of Sparse Principal Component Analysis (Sparse PCA)

Hui Zou et al. first introduced the concept of sparse PCA in their 2006 paper "Sparse principal component analysis" [1], published in the Journal of Computational and Graphical Statistics. As of April 3, 2014, the paper had been cited 853 times according to Google Scholar [2].

The abstract of "Sparse principal component analysis" reads as follows:

Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, PCA suffers from the fact that each principal component is a linear combination of all the original variables, thus it is often difficult to interpret the results. We introduce a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression-type optimization problem; sparse loadings are then obtained by imposing the lasso (elastic net) constraint on the regression coefficients. Efficient algorithms are proposed to fit our SPCA models for both regular multivariate data and gene expression arrays. We also give a new formula to compute the total variance of modified principal components. As illustrations, SPCA is applied to real and simulated data with encouraging results.


Motivation

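Shortcomings of PCA:

       Each principal component produced by PCA is a linear combination of all the original variables, with loadings that are typically all nonzero, so the results are often difficult to interpret.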
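The point above can be seen on a small synthetic example. The sketch below is only an illustration under stated assumptions: the data, the seed, the noise level, and the threshold of 0.2 are all made up for this post, and the `soft_threshold` helper is a crude stand-in used to show what a sparse loading vector looks like. It is not the SPCA algorithm from [1], which obtains sparsity through a lasso (elastic net) penalty in a regression formulation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n_samples = 500

# Two hidden factors: variables 0-2 follow the first, variables 3-5 the second.
factors = rng.normal(size=(n_samples, 2)) * np.array([2.0, 1.0])
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = factors @ mixing + 0.3 * rng.normal(size=(n_samples, 6))

# Ordinary PCA via SVD of the centered data; rows of Vt are the loading vectors.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]


def soft_threshold(v, t):
    """Shrink each entry toward zero by t; entries with magnitude below t become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


# Crude illustration only: threshold, then renormalize to unit length.
# This is NOT the SPCA algorithm of [1].
sparse_pc1 = soft_threshold(pc1, 0.2)
sparse_pc1 /= np.linalg.norm(sparse_pc1)

print("PCA loadings:        ", np.round(pc1, 3))
print("thresholded loadings:", np.round(sparse_pc1, 3))
```

Even though the first factor only drives variables 0-2, the ordinary PCA loading vector assigns a small nonzero weight to every variable, which is what makes it hard to read. The thresholded vector keeps only the variables that matter, which is the kind of output sparse PCA aims for in a more principled way.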


 To be continued...

 

[1] Zou, Hui, Trevor Hastie, and Robert Tibshirani. "Sparse principal component analysis." Journal of Computational and Graphical Statistics 15.2 (2006): 265-286.

[2] http://scholar.google.com/citations?view_op=view_citation&hl=en&user=dJfEfJgAAAAJ&citation_for_view=dJfEfJgAAAAJ:lSLTfruPkqcC
