[Paper note] Learning from Simulated and Unsupervised Images through Adversarial Training

  • paper
  • This is (probably) the first paper from Apple in the ML/CV field

Contribution

  • Proposes Simulated+Unsupervised (S+U) learning: use unlabeled real-world images to refine synthetic images with a GAN
  • Trains the GAN with an adversarial loss plus a self-regularization loss
  • Introduces key modifications that stabilize GAN training and prevent artifacts

Framework

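  • Refiner (generator): $\tilde{x} := R_\theta(x)$
  • Refiner loss (general formula): $\mathcal{L}_R(\theta) = \sum_i \ell_{real}(\theta; \tilde{x}_i, \mathcal{Y}) + \lambda \ell_{reg}(\theta; \tilde{x}_i, x_i)$
    • $\ell_{reg}$ minimizes the difference between the synthetic and the refined images
  • Discriminator loss: $\mathcal{L}_D(\phi) = -\sum_i \log(D_\phi(\tilde{x}_i)) - \sum_j \log(1 - D_\phi(y_j))$
    • $\tilde{x}_i$ and $y_j$ are sampled at random from the set of refined images and the set of real images $\mathcal{Y}$, respectively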
  • Algorithm:
    • [Figure 1: algorithm listing from the paper; image not available]
  • $\mathcal{L}_R$ as implemented in this paper: $\mathcal{L}_R(\theta) = -\sum_i \log(1 - D_\phi(R_\theta(x_i))) + \lambda \|R_\theta(x_i) - x_i\|_1$
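
To make the two losses above concrete, here is a minimal pure-Python sketch. It assumes the discriminator outputs the probability that an image is refined (synthetic), that images are flat lists of pixel values, and that the L1 term is accumulated per sample. The function names, the `EPS` guard, and the list-based representation are illustrative assumptions, not the authors' code.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation detail, not from the paper


def discriminator_loss(d_refined, d_real):
    """L_D = -sum_i log D(x~_i) - sum_j log(1 - D(y_j)).

    d_refined: discriminator outputs on refined images (probability of "refined").
    d_real:    discriminator outputs on real images.
    """
    loss_refined = -sum(math.log(p + EPS) for p in d_refined)
    loss_real = -sum(math.log(1.0 - p + EPS) for p in d_real)
    return loss_refined + loss_real


def l1_distance(a, b):
    """||a - b||_1 for two images given as flat lists of pixel values."""
    return sum(abs(u - v) for u, v in zip(a, b))


def refiner_loss(d_refined, refined, synthetic, lam):
    """L_R = sum_i [ -log(1 - D(R(x_i))) + lam * ||R(x_i) - x_i||_1 ]."""
    total = 0.0
    for p, r, x in zip(d_refined, refined, synthetic):
        adversarial = -math.log(1.0 - p + EPS)  # small when D is fooled (p near 0)
        regularizer = lam * l1_distance(r, x)   # keeps R(x_i) close to x_i
        total += adversarial + regularizer
    return total
```

For example, `refiner_loss([0.5], [[1.0, 2.0]], [[0.0, 2.0]], 0.1)` adds an adversarial term of log 2 to a regularization term of 0.1 × 1. A larger `lam` makes the refiner stay closer to the synthetic input, and a smaller one lets it chase realism more freely.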

Stabilize GAN training

  • Local adversarial loss: divide the refined and real images into w × h local regions (patches) and have the discriminator classify each region separately, outputting a w × h probability map
    • [Figure 2: illustration of the local adversarial loss from the paper; image not available]
    • The final loss is the sum of the losses over all regions
  • Using a history of refined images
    • Two issues arise when the discriminator only sees the latest refined images
      • The adversarial training can diverge
      • The refiner network can re-introduce artifacts that the discriminator has forgotten about
    • Keep a buffer of previously refined images; in each iteration, train the discriminator on a mini-batch of b/2 images from the buffer and b/2 newly refined images
    • After each iteration, replace b/2 of the buffered images with newly refined ones
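
The two stabilization tricks above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the probability map is a nested list holding the discriminator's per-patch probability of "refined", the buffer capacity is at least b/2, and replaced slots are chosen uniformly at random. `local_adversarial_loss` and `ImageHistoryBuffer` are hypothetical names, not taken from the paper's code.

```python
import math
import random

EPS = 1e-12  # guards against log(0); an implementation detail


def local_adversarial_loss(prob_map, is_refined):
    """Sum of per-patch cross-entropy losses over a w x h probability map.

    prob_map:   nested list; each entry is D's probability that the patch is refined.
    is_refined: True if the image came from the refiner, False if it is real.
    """
    total = 0.0
    for row in prob_map:
        for p in row:
            target_prob = p if is_refined else 1.0 - p
            total += -math.log(target_prob + EPS)
    return total


class ImageHistoryBuffer:
    """Fixed-capacity pool of previously refined images (capacity >= b/2 assumed)."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.images = []
        self.rng = rng or random.Random()

    def discriminator_batch(self, new_refined):
        """Return b/2 buffered + b/2 fresh images, then update the buffer.

        new_refined: list of b freshly refined images (b assumed even).
        While the buffer is still filling up, missing history is padded
        with extra fresh images so the batch always has b entries.
        """
        half = len(new_refined) // 2
        n_old = min(half, len(self.images))
        old = self.rng.sample(self.images, n_old)
        batch = old + new_refined[: len(new_refined) - n_old]

        # Buffer update: fill free slots first, then overwrite random slots.
        fresh = new_refined[:half]
        free = self.capacity - len(self.images)
        self.images.extend(fresh[:free])
        leftover = fresh[free:]
        slots = self.rng.sample(range(len(self.images)), len(leftover))
        for slot, img in zip(slots, leftover):
            self.images[slot] = img
        return batch
```

In a training loop, one would call `discriminator_batch` once per discriminator step and feed the returned images to the discriminator alongside a batch of real images, scoring each image with `local_adversarial_loss`.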

Experiments

  • Gaze estimation
    • Dataset: MPIIGaze dataset
    • Synthesizer: UnityEyes
    • Visual Turing test: human subjects cannot reliably tell refined images from real ones
    • Quantitative result: a 22.3% improvement
  • Hand pose estimation
    • Dataset: NYU hand pose dataset
    • CNN trained for pose estimation: Hourglass network
