Preference Learning: Introduction


Preference learning refers to the problem of learning from observations that reveal, either explicitly or implicitly, information about the preferences of an individual (e.g., a user of a computer system) or a class of individuals; the acquisition of this kind of information can be supported by methods for preference mining.

Preference learning is about inducing predictive preference models from empirical data.

Special emphasis will be put on learning to rank, which is by now one of the most extensively studied tasks in preference learning.

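  • In AI, preference handling gives research on knowledge representation and reasoning qualitative and symbolic methods that complement the traditional approaches, such as those of economic decision …
  • Training data in preference learning usually does not come as complete information but in more general forms, such as relative preference information or various kinds of indirect feedback and implicit preference information.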

Preference Learning Tasks

  • A preference learning task:
    • is given a set of data whose preferences are known
    • learns a function that can predict preferences for new data


A data object consists of an instance (the input, also called predictive or independent variable in statistics) and an associated class label (the output, also called target or dependent variable in statistics). The former is normally denoted by x, and the corresponding instance space by X, while the output space is denoted by Y.

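  • Note that an instance here is a vector, and its associated class label is also a vector. This differs from ordinary classification, where the class label is a single variable.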


Learning to rank is the most prominent topic in preference learning. The main kinds of ranking problems are introduced briefly below:

Label Ranking

  • Input:
    • Given an instance space X
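    • Given a finite set of labels Y = {y1, y2, …, yk}.
  • Output:
    • Learn a “label ranker” in the form of a mapping X -> SY,
    • where the output space SY is the set of all total orders (permutations) of the label set Y.
  • πx(i): the position of the i-th label yi in the ranking associated with instance x.
  • πx: a permutation of {1, …, k} that represents the label ranking of instance x.
  • Listing the labels of x in rank order, from most to least preferred, gives
    y_{πx⁻¹(1)} ≻x y_{πx⁻¹(2)} ≻x … ≻x y_{πx⁻¹(k)},
    where πx⁻¹(j) is the index of the label placed at position j.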


  • The label ranking problem is summarized in Fig. 1.
    [Fig. 1: the label ranking problem]
  • Application scenarios:

    • predicting the preferential order of a fixed set of products based on the demographic properties of a person.
    • ordering a set of genes according to their expression level, based on features of their phylogenetic profile.
  • Another application scenario is meta-learning:
    • the task is to rank learning algorithms according to their suitability for a new dataset, based on the characteristics of this dataset.
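
The position notation πx from the label ranking definition can be made concrete with a small sketch. The label names and the example positions below are made up for illustration, and the helper `positions_to_ordering` is a hypothetical name; it simply inverts the permutation so that labels can be listed from most to least preferred.

```python
def positions_to_ordering(pi):
    """pi[i] is the 1-based position of label i in the ranking.

    Returns the label indices listed from position 1 (best) to position k,
    i.e. the inverse permutation of pi.
    """
    ordering = [None] * len(pi)
    for label, pos in enumerate(pi):
        ordering[pos - 1] = label
    return ordering


labels = ["y1", "y2", "y3", "y4"]
pi_x = [3, 1, 4, 2]  # assumed example: y1 is at position 3, y2 at position 1, ...

ordering = [labels[i] for i in positions_to_ordering(pi_x)]
print(" > ".join(ordering))  # y2 > y4 > y1 > y3
```

Storing positions (πx) makes it cheap to ask "where is label yi ranked?", while the inverse form answers "which label is at position j?".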


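  • Finally, the predictive performance of a label ranker has to be measured. Two common measures are:
    • ranking loss: the number of label pairs that are ordered incorrectly, e.g. the prediction ranks i above j although j is actually ranked above i.
    • position error: the position of the true top-ranked label in the predicted ranking, minus 1. For example, if A is first in the true ranking but fourth in the predicted ranking, the error is 3.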
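
Both measures are easy to compute once rankings are stored as position lists. The following is a minimal sketch under that assumption (entry i holds the 1-based position of label i); the function names and the example data are illustrative, not taken from any library.

```python
from itertools import combinations


def ranking_loss(true_pos, pred_pos):
    """Number of label pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(true_pos)), 2)
        if (true_pos[i] - true_pos[j]) * (pred_pos[i] - pred_pos[j]) < 0
    )


def position_error(true_pos, pred_pos):
    """Predicted position of the truly top-ranked label, minus 1."""
    top_label = true_pos.index(1)
    return pred_pos[top_label] - 1


# Labels A, B, C, D: A is truly first but predicted fourth.
true_pos = [1, 2, 3, 4]
pred_pos = [4, 1, 2, 3]

print(ranking_loss(true_pos, pred_pos))    # 3 (pairs A-B, A-C, A-D are swapped)
print(position_error(true_pos, pred_pos))  # 3
```

The ranking loss looks at the whole ranking, whereas the position error only cares about how far down the true favourite has been pushed.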

Instance Ranking

  • Instance ranking can be understood as ordered classification of instances.
  • Input:
    • An instance x ∈ X belongs to one of a finite set of classes Y = {y1, y2, …, yk}.
    • The classes have a natural order: y1 < y2 < … < yk.
  • Output:
    • A ranking function f(·) that takes a set of instances and produces a ranking of these instances as output (typically by assigning a score to each instance and then sorting by scores).
  • Although there are several class labels, they only describe the range of values a label can take; each instance has exactly one of them.

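Example: assigning submitted papers to the classes {reject, weak reject, weak accept, accept}.

  • Depending on the number of classes, instance ranking becomes multipartite ranking (k > 2) or bipartite ranking (k = 2).

    See Fig. 2 for a formalization of this task.

  • For this kind of problem, model performance is usually measured with a pairwise ranking error: the fraction of instance pairs that are ranked in the wrong order relative to their classes. Bipartite ranking is a special case in which this error equals 1 − AUC (equivalently, the pairwise ranking accuracy equals the AUC).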
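
The link between the pairwise ranking error and AUC in the bipartite case can be checked on a toy example. This is a sketch under assumptions: classes are encoded as 1 (higher class) and 0 (lower class), ties count as half an error, and the scores are made up.

```python
def pairwise_ranking_error(scores, labels):
    """Fraction of (positive, negative) pairs in which the negative instance
    is scored at least as high as the positive one (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    errors = 0.0
    for p in pos:
        for n in neg:
            if n > p:
                errors += 1.0
            elif n == p:
                errors += 0.5
    return errors / (len(pos) * len(neg))


scores = [0.9, 0.8, 0.35, 0.6, 0.2]  # assumed scores from some ranking function f
labels = [1, 1, 1, 0, 0]

err = pairwise_ranking_error(scores, labels)
auc = 1.0 - err
print(round(err, 4), round(auc, 4))  # 0.1667 0.8333
```

Only one of the six positive/negative pairs is misordered (0.35 versus 0.6), so the error is 1/6 and the AUC is 5/6.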

Object Ranking

  • Object ranking resembles unsupervised learning in one respect: the objects are described by features only, with no class label attached. It is also known as “learning to order things”.
  • Input:
    • Given a set of objects Z
  • Output:
    • A ranking of these objects (this is typically done by assigning a score to each object and then sorting by scores).

Preference Learning Techniques

Learning Utility Functions

Such a function assigns an abstract degree of utility to each alternative under consideration. Depending on the underlying utility scale, which is typically either numerical or ordinal, the problem becomes one of regression learning or ordered classification.

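  • In the instance and object preferences scenario:
    • A utility function is a mapping f: X -> R that assigns a utility degree f(x) to each instance (object) x and hence induces a complete order on X.
  • In the label preferences scenario:

    • A utility function fi: X -> R is needed for each of the labels yi (i = 1, …, k); fi(x) is the utility assigned to alternative yi by instance x.
  • In instance ranking, the utility values of the training instances are already given. In principle the problem can therefore be solved with classification or regression algorithms, but unlike conventional classification the goal is to maximize ranking performance, so conventional algorithms need to be adapted accordingly.

  • In object/label ranking, the training data usually comes from a kind of indirect supervision: what is given are constraints on the utility function, i.e. comparative information such as "this object (label) should have a higher utility than that one". The goal is therefore to find a utility function that satisfies these constraints.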
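
To illustrate learning a utility function from comparative constraints, here is a minimal sketch that fits a linear utility f(x) = w·x with a perceptron on difference vectors. The linear form, the perceptron update, the object features, and the preference pairs are all assumptions chosen for illustration; the source does not prescribe a particular algorithm.

```python
def learn_utility(pairs, dim, epochs=100, lr=1.0):
    """Find w with w.a > w.b for every pair (a, b) where a is preferred to b.

    Perceptron on difference vectors; stops once all constraints hold.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        updated = False
        for a, b in pairs:
            diff = [ai - bi for ai, bi in zip(a, b)]
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:  # constraint violated
                w = [wi + lr * di for wi, di in zip(w, diff)]
                updated = True
        if not updated:
            break
    return w


def utility(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical objects with two features, and observed preferences "p over q".
objects = {"a": (3.0, 1.0), "b": (2.0, 2.0), "c": (1.0, 1.0), "d": (0.0, 3.0)}
prefs = [("a", "b"), ("b", "c"), ("c", "d")]
pairs = [(objects[p], objects[q]) for p, q in prefs]

w = learn_utility(pairs, dim=2)
ranking = sorted(objects, key=lambda name: utility(w, objects[name]), reverse=True)
print(w, ranking)  # [2.0, 0.0] ['a', 'b', 'c', 'd']
```

Once the utility function is learned, ranking any new set of objects is just scoring and sorting. If the constraints cannot all be satisfied by a linear function, the loop simply stops after `epochs` passes with some constraints still violated.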

Learning Preference Relations

The key idea of this approach is to learn a binary preference relation that compares pairs of alternatives (e.g., objects or labels).

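  • Advantage:
    • The preference information (which alternative is preferred to which) does not have to be converted into constraints on a utility function; the comparative training information is used directly to build the model.
  • Disadvantage:
    • Prediction becomes harder, because a learned binary preference relation is not necessarily transitive (A is preferred to B, B to C, and C to A, in which case no ranking is consistent with the relation).
  • Remedy:
    • Find a ranking that keeps this inconsistency as small as possible.
  • Measure of inconsistency:
    • the number of object pairs whose ranks are in conflict with their pairwise preference (minimizing this number is NP-hard, but simple voting gives an approximate solution)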
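
The voting idea can be sketched as follows: each alternative gets one vote for every pairwise comparison it wins, and alternatives are sorted by vote count. The preference relation below is a made-up example containing the cycle A > B > C > A; the function names are hypothetical.

```python
from itertools import combinations


def vote_ranking(items, prefers):
    """Rank items by the number of pairwise comparisons they win."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):
        if prefers(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
    # sorted() is stable, so ties keep their original order in `items`.
    return sorted(items, key=lambda x: -wins[x])


def conflicts(ranking, prefers):
    """Number of pairs ranked (a before b) although b is preferred to a."""
    return sum(
        1
        for i, a in enumerate(ranking)
        for b in ranking[i + 1:]
        if not prefers(a, b)
    )


# Assumed learned relation: a cycle among A, B, C, and everything beats D.
beats = {("A", "B"), ("B", "C"), ("C", "A"), ("A", "D"), ("B", "D"), ("C", "D")}


def prefers(a, b):
    return (a, b) in beats


ranking = vote_ranking(["A", "B", "C", "D"], prefers)
print(ranking, conflicts(ranking, prefers))  # ['A', 'B', 'C', 'D'] 1
```

Because of the cycle, every ranking must violate at least one pairwise preference; here the voting result violates exactly one (C over A).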

Model-Based Preference Learning

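Another way to learn ranking functions is to proceed from specific model assumptions, that is, assumptions about the structure of the preference relations.

Example of such an assumption: the target ranking of a set of objects described in terms of multiple attributes can be represented as a lexicographic order.

  • This approach is less general than the previous two, because it depends strongly on the specific assumptions made.

Such an assumption acts as an inductive bias on the hypothesis space and can shrink it dramatically. For example, with m = 4 binary attributes (k = 2 values each) there are 2^4 = 16 distinct objects, so the space of all rankings contains 16! ≈ 2.1 × 10^13 hypotheses, whereas there are only 2^4 · 4! = 384 lexicographic orders, which greatly simplifies learning.

In practice, however, the lexicographic assumption is rarely used, because preferences over the values of one attribute often depend on the values of other attributes: if the main course is meat, red wine is the preferred drink, but if the main course is fish, white wine ranks ahead of red wine.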
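
The hypothesis-space counts can be verified by brute force. The sketch below assumes four binary attributes and models a lexicographic order as an importance ordering of the attributes plus a preferred value for each attribute; it then enumerates all such orders and counts the distinct rankings they induce.

```python
from itertools import permutations, product
from math import factorial

m = 4  # number of binary attributes
objects = list(product([0, 1], repeat=m))  # all 2^4 attribute vectors

# Unrestricted hypothesis space: every permutation of the 16 objects.
n_rankings = factorial(len(objects))

# Lexicographic orders: choose an attribute importance order and, for each
# attribute, which of its two values is preferred.
lex_orders = set()
for attr_order in permutations(range(m)):
    for preferred in product([0, 1], repeat=m):
        def key(obj):
            # False (preferred value) sorts before True, most important attribute first.
            return tuple(obj[a] != preferred[a] for a in attr_order)
        lex_orders.add(tuple(sorted(objects, key=key)))

print(len(objects), n_rankings, len(lex_orders))  # 16 20922789888000 384
```

Every combination of attribute order and preferred values yields a different ranking, which is why the count is exactly 4! · 2^4 = 384.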

