Self-supervised Learning: Generative or Contrastive




【1】Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey



【2】A survey on Semi-, Self- and Unsupervised Techniques in Image Classification



Deep supervised learning has achieved great success in the last decade. However, its deficiencies of dependence on manual labels and vulnerability to attacks have driven people to explore a better solution. As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years. Self-supervised representation learning leverages input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we take a look into new self-supervised learning methods for representation in computer vision, natural language processing, and graph learning. We comprehensively review the existing empirical methods and summarize them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analysis work to provide deeper thoughts on how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.





Note: self-supervised learning currently falls into two main categories: 1. Generative methods; 2. Contrastive methods.

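(1) Generative methods. These methods focus on the reconstruction error in pixel space, and most of them are trained with a pixel-level loss. The AutoEncoder is the representative example, followed by later variants such as the VAE. The basic requirement on the encoder is that it preserve as much of the important information in the original data as possible, so if the decoder can map the latent code back to the original image, the latent code is considered good enough.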
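To make the reconstruction idea concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not a method from the survey: the toy 4-pixel "image", the hand-picked weight matrices `w_good` and `w_bad`, the tied-weight linear decoder, and mean squared error as the pixel-level loss are all hypothetical choices. Nothing is trained here; the sketch only shows that a latent code which keeps the informative pixels reconstructs well, while one that discards them does not.

```python
def encode(x, w):
    # Linear encoder: each row of w produces one latent dimension.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def decode(z, w):
    # Tied-weight linear decoder: multiply the latent code by w transposed.
    n_pixels = len(w[0])
    return [sum(w[k][j] * z[k] for k in range(len(z))) for j in range(n_pixels)]


def pixel_mse(x, x_hat):
    # Pixel-level reconstruction loss: mean squared error over all pixels.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# Toy flattened "image" with four pixels (assumed data).
x = [1.0, 0.0, 1.0, 0.0]

# Hypothetical encoders with a 2-dimensional latent code.
w_good = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]  # keeps the non-zero pixels
w_bad = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]   # keeps only the zero pixels

loss_good = pixel_mse(x, decode(encode(x, w_good), w_good))
loss_bad = pixel_mse(x, decode(encode(x, w_bad), w_bad))

print("reconstruction loss (informative code):", loss_good)
print("reconstruction loss (uninformative code):", loss_bad)
```

In a real autoencoder the weights are learned by minimizing this loss over a dataset, and the decoder usually has its own parameters.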

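(2) Contrastive methods. These methods do not require the model to reconstruct the original input; instead, they want the model to tell different inputs apart in feature space. They have the following characteristics: a) a distance measure is built in feature space; b) feature invariance makes it possible to obtain multiple kinds of predictions; c) a Siamese network is used; d) no pixel-level reconstruction is needed. Because these methods do not have to reconstruct at the pixel level, optimization becomes easier. They are not without drawbacks, though: since the data carries no labels, the main problem is how to construct positive and negative samples. These methods have already achieved very good results. On classification tasks they are close to supervised learning, and on some downstream detection and segmentation tasks they even surpass supervised learning as a pre-training method.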
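As a rough sketch of what "telling inputs apart in feature space" can look like, the snippet below computes an InfoNCE-style loss in plain Python. The text above does not name a specific loss, so InfoNCE, cosine similarity as the distance measure, the temperature value, and the toy 2-dimensional embeddings are all assumptions made for illustration. The embeddings are taken to be outputs of the same shared encoder (the Siamese setup), which is not modeled here.

```python
import math


def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    # Cross-entropy over similarities, with the positive pair as the target class.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy embeddings (assumed): two views of the same image should lie close together.
anchor = [1.0, 0.0]
other_view = [0.9, 0.1]                   # positive: another view of the same image
other_images = [[0.0, 1.0], [-1.0, 0.0]]  # negatives: embeddings of different images

loss_aligned = info_nce(anchor, other_view, other_images)
# Swap roles: treat an unrelated image as the positive and the true view as a negative.
loss_misaligned = info_nce(anchor, other_images[0], [other_view, other_images[1]])

print("loss with the correct positive:", loss_aligned)
print("loss with a wrong positive:", loss_misaligned)
```

The loss is small when the anchor is closer to its positive than to every negative, which is why the choice of positive and negative samples matters so much for these methods.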

