Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders (CVPR 2019)

PDF: Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders
Code: implemented in PyTorch

Abstract

Many approaches in generalized zero-shot learning rely on cross-modal mapping between the image feature space and the class embedding space. As labeled images are expensive, one direction is to augment the dataset by generating either images or image features. However, the former misses fine-grained details and the latter requires learning a mapping associated with class embeddings. In this work, we take feature generation one step further and propose a model where a shared latent space of image features and class embeddings is learned by modality-specific aligned variational autoencoders. This leaves us with the required discriminative information about the image and classes in the latent features, on which we train a softmax classifier. The key to our approach is that we align the distributions learned from images and from side-information to construct latent features that contain the essential multi-modal information associated with unseen classes. We evaluate our learned latent features on several benchmark datasets, i.e. CUB, SUN, AWA1 and AWA2, and establish a new state of the art on generalized zero-shot as well as on few-shot learning. Moreover, our results on ImageNet with various zero-shot splits show that our latent features generalize well in large-scale settings.
This post covers zero-shot learning. The network takes two kinds of input: image features extracted by ResNet-101 (one feature vector per sample) and a per-class attribute vector (each class has exactly one attribute vector). The paper aligns these two modalities with VAEs and finally generates latent features for the unseen classes, which are used for the classification task.

Network Architecture


[Figure 2 of the paper: CADA-VAE architecture — two modality-specific VAEs (image features and class attributes) with a shared latent space]

Notation:
x: image feature of a sample, extracted by ResNet-101
c: the attribute vector of that sample's class
E: encoder
D: decoder
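
Below is a minimal PyTorch sketch of the two modality-specific encoder/decoder pairs. The hidden widths, the latent size of 64 and the 312-d attribute dimension (CUB) are illustrative assumptions rather than the paper's exact configuration; only the 2048-d ResNet-101 feature dimension comes from the text above.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an input (image feature or attribute vector) to the parameters of q(z|x)."""
    def __init__(self, in_dim, hidden_dim, latent_dim):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.mu = nn.Linear(hidden_dim, latent_dim)      # mean of the latent Gaussian
        self.logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of the latent Gaussian

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """Reconstructs a modality from a latent code z."""
    def __init__(self, latent_dim, hidden_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, out_dim))

    def forward(self, z):
        return self.net(z)

latent_dim = 64  # example value
E_x, D_x = Encoder(2048, 1024, latent_dim), Decoder(latent_dim, 1024, 2048)  # image-feature VAE
E_c, D_c = Encoder(312, 512, latent_dim), Decoder(latent_dim, 512, 312)      # attribute VAE (e.g. CUB: 312-d)
```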

Training procedure:

For each seen-class sample, the image feature and the attribute of its class are fed into the network, first encoded and then decoded. The loss being minimized has three parts (a short PyTorch sketch follows each term):
1. The VAE loss, assuming there are M modalities:


$$\mathcal{L}_{VAE} = \sum_{i=1}^{M} \mathbb{E}_{q_\phi(z|x^{(i)})}\!\left[\log p_\theta\!\left(x^{(i)}\mid z\right)\right] - \beta\, D_{KL}\!\left(q_\phi\!\left(z\mid x^{(i)}\right)\,\|\,p_\theta(z)\right)$$
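
A sketch of one modality's VAE term, continuing the modules above: the reparameterization trick plus a reconstruction error and a β-weighted KL divergence to the unit-Gaussian prior. Using an L1 reconstruction here is an illustrative assumption.

```python
import torch

def reparameterize(mu, logvar):
    # z = mu + sigma * eps with eps ~ N(0, I), keeping the sampling step differentiable
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

def vae_loss(x, x_rec, mu, logvar, beta=1.0):
    # Reconstruction error (L1 here, an illustrative choice) ...
    rec = torch.abs(x - x_rec).sum(dim=1).mean()
    # ... plus the KL divergence between q(z|x) = N(mu, sigma^2) and the prior N(0, I)
    kld = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)).mean()
    return rec + beta * kld
```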

2. Cross-alignment (CA) loss: the latent of each modality is decoded by every other modality's decoder.

$$\mathcal{L}_{CA} = \sum_{i}^{M}\sum_{j \neq i}^{M} \left| x^{(j)} - D_j\!\left(E_i\!\left(x^{(i)}\right)\right) \right|$$
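
Continuing the sketch for the two-modality case: the attribute latent is decoded by the image decoder and the image latent by the attribute decoder, and the L1 reconstruction errors are penalized. This is an illustrative rendering of the equation above, not the authors' exact code.

```python
import torch

def cross_alignment_loss(x, c, z_x, z_c, D_x, D_c):
    # Decode each latent with the *other* modality's decoder
    x_from_c = D_x(z_c)   # attribute latent -> reconstructed image feature
    c_from_x = D_c(z_x)   # image latent -> reconstructed attribute vector
    return (torch.abs(x - x_from_c).sum(dim=1).mean()
            + torch.abs(c - c_from_x).sum(dim=1).mean())
```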

3. Distribution-alignment (DA) loss: the 2-Wasserstein distance between the latent Gaussian distributions of the modalities.


$$W_{ij} = \left( \left\|\mu_i - \mu_j\right\|_2^2 + \left\|\Sigma_i^{1/2} - \Sigma_j^{1/2}\right\|_{F}^2 \right)^{\frac{1}{2}}, \qquad \mathcal{L}_{DA} = \sum_{i}^{M}\sum_{j \neq i}^{M} W_{ij}$$
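
For two diagonal Gaussians, the Frobenius norm of the difference of the covariance square roots reduces to an elementwise difference of standard deviations, so the Wasserstein term above can be sketched as follows (the small epsilon is an assumption for numerical stability):

```python
import torch

def distribution_alignment_loss(mu_x, logvar_x, mu_c, logvar_c, eps=1e-12):
    # 2-Wasserstein distance between N(mu_x, diag(sigma_x^2)) and N(mu_c, diag(sigma_c^2))
    std_x = torch.exp(0.5 * logvar_x)
    std_c = torch.exp(0.5 * logvar_c)
    w2 = (mu_x - mu_c).pow(2).sum(dim=1) + (std_x - std_c).pow(2).sum(dim=1)
    return (w2 + eps).sqrt().mean()
```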

Total loss:

$$\mathcal{L}_{CADA\text{-}VAE} = \mathcal{L}_{VAE} + \gamma\,\mathcal{L}_{CA} + \delta\,\mathcal{L}_{DA}$$
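
Putting the three terms together, one training step on a batch of seen-class pairs (x, c) could look like the sketch below; β, γ and δ are hyperparameters (the paper anneals them during training, the fixed defaults here are placeholders).

```python
import torch

def cada_vae_loss(x, c, beta=1.0, gamma=1.0, delta=1.0):
    # Encode both modalities and sample latents with the reparameterization trick
    mu_x, logvar_x = E_x(x)
    mu_c, logvar_c = E_c(c)
    z_x = reparameterize(mu_x, logvar_x)
    z_c = reparameterize(mu_c, logvar_c)

    # 1) per-modality VAE terms
    l_vae = (vae_loss(x, D_x(z_x), mu_x, logvar_x, beta)
             + vae_loss(c, D_c(z_c), mu_c, logvar_c, beta))
    # 2) cross-alignment and 3) distribution alignment
    l_ca = cross_alignment_loss(x, c, z_x, z_c, D_x, D_c)
    l_da = distribution_alignment_loss(mu_x, logvar_x, mu_c, logvar_c)

    return l_vae + gamma * l_ca + delta * l_da
```

The total loss is backpropagated through both encoder/decoder pairs jointly, so the two latent spaces are trained into a single aligned space.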

Generating the latent features (the z in Figure 2):

Seen classes: feed a seen-class image feature x into the image encoder to obtain z.
Unseen classes: feed an unseen-class attribute vector into the attribute encoder to obtain z.
The code duplicates the samples or attribute vectors: during reparameterization a z is sampled from the latent distribution for each copy, which enriches the set of latent features. In this way a single unseen-class attribute vector corresponds to many z vectors (see the sketch below).
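
A sketch of how those latent features could be generated for the final softmax classifier: each attribute vector is repeated before the reparameterization step, so every copy yields a different z. The number of samples per unseen class is a hypothetical value.

```python
import torch

def sample_latents(encoder, inputs, n_per_input):
    # Repeat each input n_per_input times, then draw one z per copy via the
    # reparameterization trick, so a single input yields n_per_input different latents.
    rep = inputs.repeat_interleave(n_per_input, dim=0)
    mu, logvar = encoder(rep)
    return reparameterize(mu, logvar)

# Seen classes: latents from image features; unseen classes: latents from class attributes.
# z_seen   = sample_latents(E_x, seen_image_features, 1)
# z_unseen = sample_latents(E_c, unseen_class_attributes, 200)  # e.g. 200 latents per unseen class
# A softmax classifier is then trained on these latent features.
```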

Experimental Results


[Figure: experimental results on CUB, SUN, AWA1 and AWA2]
