Boosting the Generalization Capability in Cross-Domain Few-shot Learning via Supervised Autoencoder

Paper link: https://openaccess.thecvf.com/content/ICCV2021/papers/Liang_Boosting_the_Generalization_Capability_in_Cross-Domain_Few-Shot_Learning_via_Noise-Enhanced_ICCV_2021_paper.pdf

Motivation

[Image 1 from the original paper]

Different from general few-shot learning (FSL), where the large-scale source dataset and the few-shot novel dataset come from the same domain, in cross-domain few-shot learning (CDFSL) the source and target datasets come from different domains, i.e., the marginal feature distributions of images in the two domains differ substantially.


Contributions

- This work is the first to propose using a supervised autoencoder (SAE) framework to boost model generalization capability under few-shot learning settings.

- It takes the reconstructed images from the autoencoder as noisy inputs and lets the model further predict their labels, which is shown to further enhance the model's generalization capability.

- The two-step fine-tuning procedure, which performs reconstruction on the novel classes, better adapts the model to the target domain.


Methodology (figures are from the original paper)

Problem Formulation 

Given: 1) a source domain \mathcal{T}_s; 2) a target domain \mathcal{T}_t.

There is a domain shift between them; the model is pre-trained on \mathcal{T}_s and fine-tuned on \mathcal{T}_t.

Each “N-way K-shot” classification task in the target domain contains a support set \mathcal{D}_t^s and a query set \mathcal{D}_t^q. The support set contains N classes with K labeled images per class, and the query set contains Q unlabeled images from each of the same N classes.

Objective: CDFSL aims to achieve high classification accuracy on the query set \mathcal{D}_t^q when K is small.
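
To make the episode structure concrete, here is a minimal sketch (not code from the paper) of sampling one N-way K-shot task from a labeled dataset; the helper name `sample_episode` and its arguments are illustrative assumptions:

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=5, q_query=15):
    """Sample one N-way K-shot episode as (support, query) lists of (image, label) pairs.

    `dataset` is assumed to be a list of (image, class_id) pairs; this only
    illustrates the task structure, not the paper's actual data pipeline.
    """
    by_class = defaultdict(list)
    for img, cls in dataset:
        by_class[cls].append(img)

    # pick N classes, then K support and Q query images per class
    classes = random.sample(list(by_class.keys()), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        imgs = random.sample(by_class[cls], k_shot + q_query)
        support += [(img, episode_label) for img in imgs[:k_shot]]
        query += [(img, episode_label) for img in imgs[k_shot:]]
    return support, query
```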

[Image 2 from the original paper]

The method follows a transfer-learning pipeline: a model is first pre-trained on the source domain dataset. To give this model stronger generalization capability, the authors propose the noise-enhanced supervised autoencoder (NSAE): NSAE not only predicts the class labels of the inputs but also predicts the labels of the “noisy” reconstructions.


Pre-train on the source domain

For samples from \mathcal{T}_s, NSAE training computes not only the reconstruction loss, but also the classification loss obtained by feeding the reconstructed images back into the classification module:

[Image 3 from the original paper: the pre-training loss]
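
A minimal PyTorch-style sketch of this pre-training objective, assuming an encoder/decoder/classifier decomposition: classification loss on the original input, plus a reconstruction loss, plus a classification loss on the reconstruction treated as a noisy input. The weighting factors `lambda_rec` and `lambda_noisy` and the use of MSE are illustrative assumptions; the exact loss forms and weights are given in the paper.

```python
import torch.nn.functional as F

def nsae_pretrain_loss(encoder, decoder, classifier, x, y,
                       lambda_rec=1.0, lambda_noisy=1.0):
    """Sketch of the NSAE pre-training objective on a source-domain batch (x, y)."""
    z = encoder(x)                        # latent features of the original images
    cls_loss = F.cross_entropy(classifier(z), y)      # predict labels of the inputs

    x_rec = decoder(z)                    # reconstruct the inputs
    rec_loss = F.mse_loss(x_rec, x)       # reconstruction loss (loss choice may differ in the paper)

    z_noisy = encoder(x_rec)              # feed the "noisy" reconstructions back in
    noisy_cls_loss = F.cross_entropy(classifier(z_noisy), y)  # predict their labels too

    return cls_loss + lambda_rec * rec_loss + lambda_noisy * noisy_cls_loss
```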


Fine-tune on the target domain

Two-step fine-tuning (a hedged sketch follows the list):

1) In the first step, support images from the target-domain support set \mathcal{D}_t^s are used to fine-tune the autoencoder architecture, with the model minimizing the reconstruction loss.

2) In the second step, only the encoder is kept and fine-tuned on \mathcal{D}_t^s with the classification loss.
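
A minimal sketch of this two-step procedure, under the same encoder/decoder/classifier assumption as above; the optimizer, epoch counts, learning rate, and MSE reconstruction loss are illustrative choices, not the paper's settings:

```python
import torch
import torch.nn.functional as F

def fine_tune_on_support(encoder, decoder, classifier, support_loader,
                         rec_epochs=10, cls_epochs=10, lr=1e-3):
    """Two-step fine-tuning sketch on the target-domain support set."""
    # Step 1: fine-tune the autoencoder (encoder + decoder) with the reconstruction loss
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(rec_epochs):
        for x, _ in support_loader:            # labels unused in this step
            loss = F.mse_loss(decoder(encoder(x)), x)
            opt.zero_grad(); loss.backward(); opt.step()

    # Step 2: discard the decoder; fine-tune the encoder with the classification loss
    opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=lr)
    for _ in range(cls_epochs):
        for x, y in support_loader:
            loss = F.cross_entropy(classifier(encoder(x)), y)
            opt.zero_grad(); loss.backward(); opt.step()
```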


Choices of loss functions

The paper uses different loss functions for pre-training and fine-tuning; see the original paper for the specific choices.
