scikit-learn: 4.4. Unsupervised dimensionality reduction

Reference: http://scikit-learn.org/stable/modules/unsupervised_reduction.html


If the number of features is high, it is often useful to apply an unsupervised dimensionality reduction step before the supervised step.



Translations of the three sections below will be added later.

4.4.1. PCA: principal component analysis

decomposition.PCA looks for a combination of features that captures the variance of the original features well. See Decomposing signals in components (matrix factorization problems). Translated article: http://blog.csdn.net/mmc2015/article/details/46867597.

Examples

  • Faces recognition example using eigenfaces and SVMs
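A minimal sketch of how this is typically used (not from the original post; the digits dataset and n_components=16 are illustrative choices):

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 1797 images of 8x8 pixels, flattened to 64 features each
X, _ = load_digits(return_X_y=True)

# keep the 16 directions of largest variance
pca = PCA(n_components=16)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                       # (1797, 16)
print(pca.explained_variance_ratio_.sum())   # fraction of variance retained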

4.4.2. Random projections

The random_projection module provides several tools for data reduction by means of random projections. See the relevant section of the documentation: Random Projection. Translated article: http://blog.csdn.net/mmc2015/article/details/47067003.

Examples

  • The Johnson-Lindenstrauss bound for embedding with random projections
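A minimal sketch (not from the original post; the data shape and eps value are illustrative): johnson_lindenstrauss_min_dim gives a lower bound on the embedding dimension that preserves pairwise distances up to a distortion of eps, and SparseRandomProjection picks its target dimension from eps automatically when n_components='auto' (the default):

import numpy as np
from sklearn.random_projection import (SparseRandomProjection,
                                       johnson_lindenstrauss_min_dim)

# minimal safe target dimension for 100 samples at distortion eps=0.5
print(johnson_lindenstrauss_min_dim(n_samples=100, eps=0.5))

rng = np.random.RandomState(0)
X = rng.rand(100, 10000)

# n_components='auto' derives the target dimension from eps
transformer = SparseRandomProjection(eps=0.5, random_state=0)
X_new = transformer.fit_transform(X)
print(X_new.shape)                    # far fewer than 10000 columns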

4.4.3. Feature agglomeration

cluster.FeatureAgglomeration applies Hierarchical clustering to group together features that behave similarly.

Examples

  • Feature agglomeration vs. univariate selection
  • Feature agglomeration
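A minimal sketch (not from the original post; the digits data and n_clusters=8 are illustrative): neighbouring pixels behave similarly, so agglomerating the 64 pixel features into 8 clusters preserves much of the structure:

from sklearn.datasets import load_digits
from sklearn.cluster import FeatureAgglomeration

X, _ = load_digits(return_X_y=True)   # (1797, 64)

# merge the 64 features into 8 clusters; each new feature is the
# mean of the original features assigned to its cluster
agglo = FeatureAgglomeration(n_clusters=8)
X_reduced = agglo.fit_transform(X)
print(X_reduced.shape)                # (1797, 8)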

Feature scaling

Note that if features have very different scaling or statistical properties, cluster.FeatureAgglomeration may not be able to capture the links between related features. Using a preprocessing.StandardScaler can be useful in these settings.
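A minimal sketch of that preprocessing step (not from the original post; the synthetic data and n_clusters=3 are illustrative):

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import FeatureAgglomeration

# 10 features whose scales span four orders of magnitude
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 10)) * np.logspace(0, 4, 10)

# standardize first so agglomeration compares features on an equal footing
X_scaled = StandardScaler().fit_transform(X)
X_reduced = FeatureAgglomeration(n_clusters=3).fit_transform(X_scaled)
print(X_reduced.shape)                # (200, 3)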



Pipelining: The unsupervised data reduction and the supervised estimator can be chained in one step. See Pipeline: chaining estimators.
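A minimal sketch of such a pipeline (not from the original post; PCA with n_components=16 feeding an SVC is an illustrative pairing):

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# fit() runs PCA.fit_transform on the training data, then fits the SVC
# on the reduced features; score() applies the same chain to the test set
pipe = Pipeline([("reduce_dim", PCA(n_components=16)),
                 ("clf", SVC())])
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))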

