Notes on “Generative Adversarial Networks: An Overview”

Generative Adversarial Networks: An Overview

Authors: Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A. Bharath

GANs were first proposed by Ian Goodfellow. With the rise of deep learning they have become a very active research topic, playing an important role in image synthesis, semantic image editing, style transfer, image super-resolution, classification, and more.

Abstract

Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
Index Terms—neural networks, unsupervised learning, semi-supervised learning.

Introduction

Generative adversarial networks (GANs) are an emerging technique for both semi-supervised and unsupervised learning.

A GAN consists of a generator and a discriminator. A popular informal analogy: the generator is a counterfeiter producing fake banknotes, and the discriminator is a police officer trying to spot them. As the police get better at telling real from fake, the counterfeiter's forgeries keep improving, until the fakes are (almost) indistinguishable from real money.

Preliminaries

A. Terminology

Generative models learn to capture the statistical distribution of training data, allowing us to synthesize samples from the learned distribution.
In all cases, the network weights are learned through backpropagation.

B. Notation

  • z denotes a vector in the latent space;
  • pdata(x) denotes the probability density function over the real data x;
  • pg(x) denotes the distribution of the vectors produced by the generator network of the GAN;
  • G and D denote the generator and discriminator networks, respectively;
  • JG(ΘG; ΘD) and JD(ΘD; ΘG) denote the objective functions of the generator and discriminator, respectively.
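
For reference, these pieces fit together in the standard minimax objective from the original GAN paper (Goodfellow et al., 2014), where pz(z) denotes the prior over the latent space:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator objective JD(ΘD; ΘG) corresponds to maximizing V with respect to ΘD, and the generator objective JG(ΘG; ΘD) to minimizing it with respect to ΘG.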

C. Capturing Data Distributions

GANs learn through implicitly computing some sort of similarity between the distribution of a candidate model and the distribution corresponding to real data.

The paper situates GANs relative to other approaches for capturing data distributions:

  • Fourier-based and wavelet representations
  • Principal Components Analysis (PCA)
  • Independent Components Analysis (ICA)
  • Noise-contrastive estimation (NCE)
  • GANs

GAN Architectures

A. Fully Connected GANs

The first GAN architectures used fully connected neural networks for both the generator and discriminator.
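
As a concrete illustration (assumed sizes, not the original architecture), a minimal fully connected generator and discriminator might look like the following PyTorch sketch, here for 784-dimensional data such as flattened 28×28 images:

```python
# Minimal sketch: fully connected generator and discriminator in PyTorch.
import torch
import torch.nn as nn

latent_dim, data_dim = 100, 784

# Generator: maps a latent vector z to a point in data space.
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),      # outputs scaled to [-1, 1]
)

# Discriminator: maps a data-space sample to the probability that it is real.
D = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

z = torch.randn(16, latent_dim)   # a batch of latent vectors
fake = G(z)                       # generated samples
p_real = D(fake)                  # discriminator's estimate that each is real
```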

B. Convolutional GANs

  • The Laplacian pyramid of adversarial networks (LAPGAN) decomposes the generation process across multiple scales.
  • DCGAN (for “deep convolutional GAN”) allows training a pair of deep convolutional generator and discriminator networks (a minimal generator sketch follows this list).
  • Volumetric convolutions have been used to build GANs that synthesize 3D data samples.
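
The DCGAN-style generator sketch below assumes a 100-dimensional latent vector and 64×64 RGB output; the channel counts are illustrative rather than the published configuration. It uses the DCGAN ingredients of fractionally strided (transposed) convolutions, batch normalization, ReLU activations, and a Tanh output:

```python
# Minimal DCGAN-style generator sketch (illustrative sizes).
import torch
import torch.nn as nn

G = nn.Sequential(
    # z: (N, 100, 1, 1) -> (N, 256, 4, 4)
    nn.ConvTranspose2d(100, 256, kernel_size=4, stride=1, padding=0, bias=False),
    nn.BatchNorm2d(256), nn.ReLU(),
    # -> (N, 128, 8, 8)
    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),
    nn.BatchNorm2d(128), nn.ReLU(),
    # -> (N, 64, 16, 16)
    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
    nn.BatchNorm2d(64), nn.ReLU(),
    # -> (N, 32, 32, 32)
    nn.ConvTranspose2d(64, 32, 4, 2, 1, bias=False),
    nn.BatchNorm2d(32), nn.ReLU(),
    # -> (N, 3, 64, 64), an RGB image in [-1, 1]
    nn.ConvTranspose2d(32, 3, 4, 2, 1, bias=False),
    nn.Tanh(),
)

imgs = G(torch.randn(8, 100, 1, 1))   # eight generated 3x64x64 images
```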

C. Conditional GANs

A parallel can be drawn between conditional GANs and InfoGAN, which decomposes the noise source into an incompressible source and a “latent code”, attempting to discover latent factors of variation by maximizing the mutual information between the latent code and the generator’s output.
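
As a rough sketch of the conditional setup (assumed sizes, not the paper's code), the conditioning information, here a one-hot class label y, can simply be concatenated to the generator's latent input and to the discriminator's input:

```python
# Minimal conditional-GAN sketch: both networks are conditioned on the label.
import torch
import torch.nn as nn

latent_dim, n_classes, data_dim = 100, 10, 784

G = nn.Sequential(nn.Linear(latent_dim + n_classes, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim + n_classes, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

z = torch.randn(16, latent_dim)
labels = torch.randint(0, n_classes, (16,))
y = nn.functional.one_hot(labels, n_classes).float()

fake = G(torch.cat([z, y], dim=1))        # samples conditioned on the label
p_real = D(torch.cat([fake, y], dim=1))   # the discriminator also sees y
```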

D. GANs with Inference Models

In this formulation, the generator consists of two networks: the “encoder” (inference network) and the “decoder”. They are jointly trained to fool the discriminator.
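
A minimal sketch of this formulation, in the spirit of ALI/BiGAN and with illustrative sizes, is shown below: the discriminator scores joint (data, latent) pairs, so the encoder and decoder must together make (x, E(x)) pairs indistinguishable from (G(z), z) pairs:

```python
# Minimal sketch of a GAN with an inference model (assumed sizes).
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 32

encoder = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(),
                        nn.Linear(64, latent_dim))            # x -> z
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, data_dim))              # z -> x
D = nn.Sequential(nn.Linear(data_dim + latent_dim, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1), nn.Sigmoid())             # scores (x, z)

x_real = torch.randn(8, data_dim)        # stand-in data batch
z_prior = torch.randn(8, latent_dim)     # samples from the latent prior

pair_enc = torch.cat([x_real, encoder(x_real)], dim=1)    # (x, E(x))
pair_dec = torch.cat([decoder(z_prior), z_prior], dim=1)  # (G(z), z)
p_enc, p_dec = D(pair_enc), D(pair_dec)  # D tries to tell the pair types apart
```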

E. Adversarial Autoencoders (AAE)

Autoencoders are networks, composed of an “encoder” and “decoder”, that learn to map data to an internal latent representation and out again. That is, they learn a deterministic mapping (via the encoder) from a data space into a latent or representation space, and a mapping (via the decoder) from the latent space back to data space.
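
A minimal sketch of the adversarial-autoencoder idea (assumed sizes, not the original implementation): a reconstruction loss trains the autoencoder, while a discriminator on the latent space pushes the encoder's codes towards samples from a chosen prior p(z):

```python
# Minimal adversarial-autoencoder sketch (illustrative sizes).
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 32

encoder = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(),
                        nn.Linear(64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, data_dim))
D_latent = nn.Sequential(nn.Linear(latent_dim, 64), nn.LeakyReLU(0.2),
                         nn.Linear(64, 1), nn.Sigmoid())

x = torch.randn(8, data_dim)             # stand-in data batch
codes = encoder(x)                       # latent codes produced by the encoder
z_prior = torch.randn(8, latent_dim)     # samples from the prior p(z)

recon_loss = nn.functional.mse_loss(decoder(codes), x)   # autoencoder term
p_prior, p_code = D_latent(z_prior), D_latent(codes)     # adversarial term
```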

Training GANs

A. Introduction

Training of GANs involves both finding the parameters of a discriminator that maximize its classification accuracy, and finding the parameters of a generator which maximally confuse the discriminator.
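
A minimal, self-contained sketch of this alternating optimization is given below. It uses the non-saturating generator loss that is common in practice, and the networks, sizes, and synthetic Gaussian "data" are stand-ins so the loop runs as written:

```python
# Minimal GAN training loop sketch (not the paper's code).
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 16, 32, 64  # illustrative sizes
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                  nn.Linear(64, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 64), nn.LeakyReLU(0.2),
                  nn.Linear(64, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

for step in range(1000):
    real = 1.0 + 2.0 * torch.randn(batch, data_dim)  # stand-in "real" batch

    # 1) Discriminator step: improve real-vs-fake classification accuracy.
    fake = G(torch.randn(batch, latent_dim)).detach()  # detach: G frozen here
    loss_D = bce(D(real), ones) + bce(D(fake), zeros)
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # 2) Generator step: confuse the discriminator by pushing D(G(z))
    #    towards the "real" label (non-saturating loss).
    fake = G(torch.randn(batch, latent_dim))
    loss_G = bce(D(fake), ones)
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
```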

One approach to improving GAN training is to assess the empirical “symptoms” that might be experienced during training. These symptoms include:
- Difficulties in getting the pair of models to converge;
- The generative model “collapsing”, i.e. generating very similar samples for different inputs;
- The discriminator loss converging quickly to zero, providing no reliable path for gradient updates to the generator.

B. Training Tricks

  • One of the first major improvements in the training of GANs for generating images was the DCGAN architecture.
  • Further heuristic approaches have been proposed for stabilizing the training of GANs: first, feature matching; second, mini-batch discrimination; third, heuristic averaging; and fourth, virtual batch normalization.
  • Finally, one-sided label smoothing and adding noise to the samples (illustrated below).
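
As an illustration of the last two heuristics (an assumed snippet, reusing the shapes from the training sketch above): one-sided label smoothing replaces the real-sample target 1.0 with a softer value such as 0.9 while leaving the fake-sample target at 0.0, and noise can be added to the discriminator's inputs:

```python
# Illustrative snippet: one-sided label smoothing and input noise.
import torch

batch, data_dim = 64, 32
real = torch.randn(batch, data_dim)        # stand-in real batch
fake = torch.randn(batch, data_dim)        # stand-in generated batch

smooth_ones = torch.full((batch, 1), 0.9)  # real target softened from 1.0
zeros = torch.zeros(batch, 1)              # one-sided: fake target stays 0.0

sigma = 0.1                                # noise scale (a hyperparameter)
real_noisy = real + sigma * torch.randn_like(real)
fake_noisy = fake + sigma * torch.randn_like(fake)
# These would then replace the plain targets/inputs in the discriminator loss:
# loss_D = bce(D(real_noisy), smooth_ones) + bce(D(fake_noisy), zeros)
```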

C. Alternative formulations

  • Generalisations of the GAN cost function;
  • Alternative cost functions to prevent vanishing gradients.
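
One well-known example of such an alternative is the “non-saturating” generator cost from the original GAN paper, which replaces minimizing log(1 − D(G(z))) with maximizing log D(G(z)), so the generator still receives useful gradients when the discriminator confidently rejects its samples:

```latex
J_G = -\,\mathbb{E}_{z \sim p_z(z)}\big[\log D(G(z))\big]
\qquad \text{instead of} \qquad
J_G = \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```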

D. A Brief Comparison of GAN Variants

Applications of GANs

  • A. Classification and Regression
  • B. Image Synthesis
  • C. Image-to-image translation
  • D. Super-resolution

Discussion

A. Open Questions

  • Mode Collapse;
  • Training instability – saddle points;
  • Evaluating Generative Models.

B. Conclusions

The explosion of interest in GANs is driven not only by their potential to learn deep, highly non-linear mappings from a latent space into a data space and back, but also by their potential to make use of the vast quantities of unlabelled image data that remain closed to deep representation learning.

刘丽
2017-10-24
