Reading Notes: Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks



  • How can multiple networks be combined? Networks with the same objective are collaborative; networks with different objectives are adversarial.


  • To do: look into the ground-truth class embedding
  • To do: look into SAE


  1. an adversarial framework combines the two networks, so that semantic transfer can be achieved
  2. an independent visual-to-semantic mapping (tackling the semantic loss problem inherent in classification)


  • problem: semantic loss
    • some semantics are discarded during training
    • they are non-discriminative for the training classes
    • but they are critical for recognizing the test classes
  • how

  • Addresses the loss of information that the classification network treats as unimportant

1. Introduction

1.1 Zero-shot learning
  • transferring knowledge from seen classes to unseen classes
  • evolution
    • primitive attribute classifier
    • semantic-embedding-based framework
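
To make the semantic-embedding-based framework concrete, here is a minimal sketch of how a prediction over unseen classes could work once an image has been mapped into the semantic space. The dot-product compatibility score, the attribute names, and the function `predict_unseen` are illustrative assumptions, not details recorded in these notes.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_unseen(image_embedding, class_embeddings):
    """Return the label whose class embedding is most compatible
    (highest dot product) with the image's semantic embedding."""
    return max(
        class_embeddings,
        key=lambda label: dot(image_embedding, class_embeddings[label]),
    )


# Hypothetical attribute vectors: [has_stripes, has_hooves, lives_in_water].
# Neither class needs to appear in the training set; only its embedding is required.
unseen_classes = {
    "zebra": [1.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}

# Hypothetical output of a visual-to-semantic mapping for one test image.
image_embedding = [0.8, 0.9, 0.1]

print(predict_unseen(image_embedding, unseen_classes))
```

Knowledge transfers because the mapping is learned on seen classes only, while the class embeddings are available for seen and unseen classes alike.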

3. Model

  • supervised Adversarial Autoencoder
  • F is the encoder, G is the decoder
  • the output of F can be considered the bottleneck layer, regularized to match the supervised embedding E
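
The encoder/decoder roles above can be sketched as three loss terms. This is only a toy illustration: it assumes a standard GAN-style log loss and a linear discriminator, and the notes do not record the exact formulation used in the paper. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reconstruction_loss(x, x_rec):
    """Mean squared error between an input x and its reconstruction G(F(x))."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)


def discriminator_loss(d_weights, e_code, f_code):
    """D (assumed linear here) is trained to score E(x) as 'real'
    and F(x') as 'fake'."""
    p_real = sigmoid(dot(d_weights, e_code))
    p_fake = sigmoid(dot(d_weights, f_code))
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def encoder_adversarial_loss(d_weights, f_code):
    """F is trained to make D score its code as 'real', which pulls the
    bottleneck code toward the distribution of E's embeddings."""
    return -math.log(sigmoid(dot(d_weights, f_code)))


d_weights = [2.0, 0.0]       # toy discriminator parameters
e_code = [1.0, 0.0]          # toy supervised embedding E(x)
f_code_far = [-1.0, 0.0]     # bottleneck code that D easily rejects
f_code_near = [0.9, 0.1]     # bottleneck code that resembles E(x)

print(encoder_adversarial_loss(d_weights, f_code_far))
print(encoder_adversarial_loss(d_weights, f_code_near))
print(discriminator_loss(d_weights, e_code, f_code_far))
```

A code from F that resembles the embeddings produced by E yields a smaller adversarial loss for F, which is the sense in which the bottleneck is regularized to match E.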

4. Implementation

4.1 Architecture
  • E: ResNet-101
  • F: AlexNet + 2 fully connected (fc) layers
  • leaky ReLU: transforms a vector into a 3D feature map
  • ground truth class embedding
4.2 Training details
  • per-pixel mean subtraction
  • the ResNet-101 in E is kept fixed; the AlexNet-like blocks in F are initialized from AlexNet, and G from the pretrained generator
  • the learning rate starts at 1e-4 and is multiplied by 0.1 when the error plateaus
  • grid search is used to select the parameters α and β
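
The learning-rate rule and the grid search over α and β can be sketched as follows. The starting rate 1e-4 and the factor 0.1 come from the notes; the `patience` and `min_delta` settings, the candidate grids, and the `evaluate` function are hypothetical, since the notes do not say how a plateau is detected or which values were searched.

```python
import itertools


class PlateauDecay:
    """Multiply the learning rate by `factor` when the error stops improving."""

    def __init__(self, lr=1e-4, factor=0.1, patience=2, min_delta=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience    # assumed: stalled steps tolerated before decay
        self.min_delta = min_delta  # assumed: smallest change counted as progress
        self.best = float("inf")
        self.stalled = 0

    def step(self, error):
        """Record one validation error and return the (possibly decayed) rate."""
        if error < self.best - self.min_delta:
            self.best = error
            self.stalled = 0
        else:
            self.stalled += 1
            if self.stalled >= self.patience:
                self.lr *= self.factor
                self.stalled = 0
        return self.lr


def grid_search(evaluate, alphas, betas):
    """Return the (alpha, beta) pair with the lowest validation error."""
    return min(itertools.product(alphas, betas), key=lambda ab: evaluate(*ab))


schedule = PlateauDecay()
for err in [1.0, 0.5, 0.5, 0.5]:
    current_lr = schedule.step(err)
print(current_lr)


def toy_evaluate(alpha, beta):
    """Stand-in for training with (alpha, beta) and measuring validation error."""
    return (alpha - 0.1) ** 2 + (beta - 1.0) ** 2


print(grid_search(toy_evaluate, [0.01, 0.1, 1.0], [0.1, 1.0, 10.0]))
```

In a real run, `evaluate` would train the model with the given weights and return its error on held-out data, so each grid point costs one full training run.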
