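Metric learning

Introduction

Definition

  The goal of distance metric learning is to measure how similar samples are to one another, which is one of the core problems in pattern recognition. The performance of many machine learning methods depends largely on the choice of similarity measure between samples. This includes classifiers such as k-nearest neighbors, support vector machines, and radial basis function networks, the K-means clustering method, and some graph-based methods.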

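Origin

  Proposed by Eric Xing at NIPS 2002.

Advantages

  Metric learning usually aims to make the distances between samples of the same class as small as possible and the distances between samples of different classes as large as possible.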
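The objective above can be checked numerically. The following is a minimal sketch, not a learning algorithm: the toy data and the transform `L` are hand-picked assumptions standing in for whatever a real method would learn. It compares the mean same-class distance with the mean different-class distance before and after the transform.

```python
import numpy as np

def mean_intra_inter(X, y):
    """Return (mean same-class distance, mean different-class distance)."""
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    same = y[:, None] == y[None, :]
    off_diag = ~np.eye(len(X), dtype=bool)   # ignore zero self-distances
    intra = dist[same & off_diag].mean()
    inter = dist[~same].mean()
    return intra, inter

# Toy data (assumption): feature 0 separates the classes, feature 1 is noise.
X = np.array([[0.0, 0.0], [0.0, 4.0], [1.0, 0.0], [1.0, 4.0]])
y = np.array([0, 0, 1, 1])

# Hypothetical "learned" linear transform: keep feature 0, shrink feature 1.
L = np.array([[1.0, 0.0],
              [0.0, 0.1]])

intra_before, inter_before = mean_intra_inter(X, y)
intra_after, inter_after = mean_intra_inter(X @ L.T, y)

# A smaller intra/inter ratio means same-class points sit relatively closer.
ratio_before = intra_before / inter_before
ratio_after = intra_after / inter_after
print(ratio_before, ratio_after)
```

In this toy case the ratio drops after the transform because the noisy feature no longer dominates the same-class distances.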

Disadvantages

  TODO

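Application areas

  Face recognition, object recognition, music similarity, human pose estimation, information retrieval, speech recognition, handwriting recognition, and other fields.

Related

  • Euclidean distance and Mahalanobis distance
  • Image features: color histograms, GIST, SIFT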
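To make the difference between the two distances concrete, here is a minimal NumPy sketch. The data, the helper names `euclidean` and `mahalanobis`, and the choice of the inverse sample covariance as the matrix `M` are assumptions made for illustration. With `M` equal to the identity, the Mahalanobis distance reduces to the Euclidean distance.

```python
import numpy as np

def euclidean(x, y):
    """Plain Euclidean distance between two vectors."""
    d = x - y
    return float(np.sqrt(d @ d))

def mahalanobis(x, y, M):
    """Mahalanobis distance sqrt((x - y)^T M (x - y)) for a PSD matrix M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

# Synthetic data (assumption): feature 1 has a much larger scale than feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * np.array([1.0, 10.0])

# Classical choice of M: the inverse of the sample covariance.
M = np.linalg.inv(np.cov(X, rowvar=False))

a = np.array([0.0, 0.0])
b = np.array([3.0, 0.0])   # 3 units along the low-variance feature
c = np.array([0.0, 3.0])   # 3 units along the high-variance feature

print(euclidean(a, b), euclidean(a, c))            # identical under Euclidean
print(mahalanobis(a, b, M), mahalanobis(a, c, M))  # b is "farther" than c
```

Metric learning methods can be read as replacing the fixed inverse covariance with a matrix `M` learned from labels or pairwise constraints.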

Methods

The surveys "An Overview of Distance Metric Learning" and "Distance Metric Learning: A Comprehensive Survey" can be found via Reference 2.

  • Supervised Distance Metric Learning

    Methods | Locality | Linearity | Learning Strategies | Code Download
    Probabilistic Global Distance Metric Learning (PGDM) | global | linear | constrained convex programming | by Eric P. Xing
    Relevant Components Analysis (RCA) | global | linear | capture global structure; use equivalence constraints | by Aharon Bar-Hillel and Tomer Hertz
    Discriminative Component Analysis (DCA) | global | linear | improve RCA by exploring negative constraints | by Steven C.H. Hoi
    Local Fisher Discriminant Analysis (LFDA) | local | linear | extend LDA by assigning greater weights to closer connecting examples | by Masashi Sugiyama
    Neighborhood Component Analysis (NCA) | local | linear | extend the nearest neighbor classifier toward metric learning | by Charless C. Fowlkes
    Large Margin NN Classifier (LMNN) | local | linear | extend NCA through a maximum margin framework | by Kilian Q. Weinberger
    Localized Distance Metric Learning (LDM) | local | linear | optimize local compactness and local separability in a probabilistic framework | by Liu Yang
    DistBoost | global | linear | learn distance functions by training binary classifiers with margins in a boosting framework | by Tomer Hertz and Aharon Bar-Hillel; notes on calling its kernel version
    Active Distance Metric Learning (BAYES+VAR) | global | linear | select example pairs with the greatest uncertainty, posterior estimation with a full Bayesian treatment | by Liu Yang
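As one concrete instance of the supervised methods listed above, the core step of RCA can be sketched in a few lines: estimate the average within-chunklet covariance (a chunklet is a group of points known to share a class) and whiten the data with its inverse square root. This is a minimal sketch under assumptions, not the code by Bar-Hillel and Hertz: the function names and toy data are hypothetical, and the optional dimensionality-reduction step is omitted.

```python
import numpy as np

def within_chunklet_cov(X, chunklets):
    """Average covariance of points around their own chunklet mean."""
    d = X.shape[1]
    C = np.zeros((d, d))
    n = 0
    for idx in chunklets:
        centered = X[idx] - X[idx].mean(axis=0)
        C += centered.T @ centered
        n += len(idx)
    return C / n

def rca_transform(X, chunklets):
    """Whitening matrix W = C^(-1/2); assumes C is full rank."""
    C = within_chunklet_cov(X, chunklets)
    evals, evecs = np.linalg.eigh(C)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

# Toy data (assumption): feature 0 is informative, feature 1 is noisy.
X = np.array([[0.0, 0.0], [0.2, 3.0], [-0.2, -3.0],
              [5.0, 0.0], [5.2, -3.0], [4.8, 3.0]])
chunklets = [[0, 1, 2], [3, 4, 5]]   # index groups known to share a class

W = rca_transform(X, chunklets)
Z = X @ W.T   # data in the learned metric space
print(W)
```

The resulting Mahalanobis matrix is the inverse within-chunklet covariance, so directions with large within-class spread are down-weighted.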

  • Unsupervised Distance Metric Learning

    Methods | Locality | Linearity | Learning Strategies | Code Download
    Principal Component Analysis (PCA) | global structure preserved | linear | best preserve the variance of the data | by Deng Cai
    Multidimensional Scaling (MDS) | global structure preserved | linear | best preserve inter-point distance in low-rank | included in Matlab Toolbox for Dimensionality Reduction
    ISOMAP | global structure preserved | nonlinear | preserve the geodesic distances | by J. B. Tenenbaum, V. de Silva and J. C. Langford
    Laplacian Eigenmap (LE) | local structure preserved | nonlinear | preserve local neighbor | by Mikhail Belkin
    Locality Preserving Projections (LPP) | local structure preserved | linear | linear approximation to LE | LPP by Deng Cai; Kernel LPP by Deng Cai
    Locally Linear Embedding (LLE) | local structure preserved | nonlinear | nonlinear preserve local neighbor | by Sam T. Roweis and Lawrence K. Saul; Hessian LLE can be found at MANI fold Learning Matlab Demo, by Todd Wittman
    Neighborhood Preserving Embedding (NPE) | local structure preserved | linear | linear approximation to LLE | by Deng Cai
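Of the unsupervised methods listed above, PCA is the simplest to write down. The following is a minimal sketch using NumPy's SVD, not the Matlab code by Deng Cai; the function name `pca` and the toy data are assumptions. It projects centered data onto the directions of largest variance.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components.

    Returns (projected data, component directions, explained variances).
    """
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # rows are unit directions
    explained_var = S[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_var

# Toy data (assumption): points lying close to the line y = x.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.1], [4.0, 3.9]])

Z, comps, var = pca(X, n_components=1)
print(comps[0])   # roughly the diagonal direction, up to sign
print(var[0])     # variance captured along that direction
```

The Euclidean distance between rows of `Z` is the induced low-rank metric: differences along the discarded directions are ignored.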

Implementations

Python

  • metric-learn
    https://pypi.python.org/pypi/metric-learn/
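    • LMNN
      from metric_learn import LMNN
      import numpy as np

      # Toy data: six 3-D points with class labels 0, 1 and 2.
      X = np.array([[0., 0., 1.], [0., 0., 2.], [1., 0., 0.], [2., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
      Y = np.array([1, 1, 2, 2, 0, 0])

      # The argument names below (k, learn_rate, verbose passed to fit) follow the
      # metric-learn release this example was written for; other releases may differ.
      lmnn = LMNN(k=2, learn_rate=1e-6)
      lmnn.fit(X, Y, verbose=False)

      # Map the points into the learned metric space.
      Y_c = lmnn.transform(X)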

      • output
        >>> Y_c
        array([[ 0.        , -0.07987306,  0.11081795],
               [ 0.        , -0.15974612,  0.22163591],
               [ 0.07113444,  0.        ,  0.        ],
               [ 0.14226889,  0.        ,  0.        ],
               [ 0.14226889, -0.04460763,  0.06188978],
               [ 0.14226889, -0.03164602,  0.04390651]])
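A linear transform such as the one LMNN learns corresponds to a Mahalanobis metric: the Euclidean distance between transformed points equals the Mahalanobis distance in the original space with M = LᵀL. The sketch below checks this identity numerically. The matrix `L` is a hypothetical stand-in chosen for illustration, not the matrix learned in the run above.

```python
import numpy as np

# Hypothetical linear map standing in for a learned transform.
L = np.array([[0.5,  0.0,  0.0],
              [0.0,  0.2, -0.1],
              [0.0, -0.1,  0.3]])

# Mahalanobis matrix induced by L; positive semidefinite by construction.
M = L.T @ L

x = np.array([0.0, 0.0, 1.0])
y = np.array([2.0, 5.0, 4.0])

# Distance measured after transforming both points ...
d_transformed = np.linalg.norm(L @ x - L @ y)
# ... equals the Mahalanobis distance in the original space.
d_mahalanobis = np.sqrt((x - y) @ M @ (x - y))

print(d_transformed, d_mahalanobis)
```

This is why a nearest-neighbor search on the transformed points is equivalent to a nearest-neighbor search under the learned metric.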

Matlab

  • DistLearnKit
    http://www.cs.cmu.edu/~liuy/distlearn.htm

R

  • Supervised Distance Metric Learning
    https://github.com/road2stat/sdml

Applications

  TODO

References

  1. https://en.wikipedia.org/wiki/Similarity_learning#Metric_learning
  2. http://www.cs.cmu.edu/~liuy/distlearn.htm
  3. http://blog.csdn.net/lzt1983/article/details/7831524
