Paper link: https://openaccess.thecvf.com/content/ICCV2021/papers/Liang_Boosting_the_Generalization_Capability_in_Cross-Domain_Few-Shot_Learning_via_Noise-Enhanced_ICCV_2021_paper.pdf
Different from general few-shot learning (FSL), where the large-scale source dataset and the few-shot novel dataset come from the same domain, under the cross-domain few-shot learning (CDFSL) setting the source and target datasets come from different domains, i.e. the marginal feature distributions of images in the two domains differ substantially.
- This is the first work to propose using a supervised autoencoder (SAE) framework to boost model generalization capability under few-shot learning settings
- This work takes the reconstructed images from the autoencoder as noisy inputs and lets the model further predict their labels, which is shown to further enhance the model's generalization capability
- A two-step fine-tuning procedure that performs reconstruction on the novel classes better adapts the model to the target domain
Problem Formulation
Given: 1) a source domain; 2) a target domain.
There is a domain shift between them; the model is pre-trained on the source domain and fine-tuned on the target domain.
Each “N-way K-shot” classification task in the target domain contains a support set and a query set. The support set contains N classes with K labeled images in each class, and the query set contains Q unlabeled images from each of the same N classes.
Objective: CDFSL aims to achieve high classification accuracy on the query set when K is small.
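To make the episode structure concrete, here is a minimal sketch of how one such N-way K-shot task could be sampled. The helper name `sample_episode`, the `(image, label)` dataset format, and the default values of N, K, and Q are our assumptions for illustration, not from the paper.

```python
import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=5, q_queries=15):
    """Sample one N-way K-shot episode (hypothetical helper).

    `dataset` is assumed to be a list of (image, label) pairs from the
    target domain. Returns (support, query) lists of (image, label).
    """
    by_class = defaultdict(list)
    for img, label in dataset:
        by_class[label].append(img)

    # Pick N classes, then K support + Q query images per class.
    classes = random.sample(list(by_class.keys()), n_way)
    support, query = [], []
    for c in classes:
        imgs = random.sample(by_class[c], k_shot + q_queries)
        support += [(img, c) for img in imgs[:k_shot]]
        query += [(img, c) for img in imgs[k_shot:]]
    return support, query
```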
The paper builds on transfer learning: a model is first pre-trained on the source-domain dataset. To give this model strong generalization capability, the authors propose the noise-enhanced supervised autoencoder (NSAE): NSAE not only predicts the class labels of the inputs but also predicts the labels of the “noisy” reconstructions.
Pre-train on the source domain
For samples from the source domain, NSAE training computes not only a reconstruction loss but also the classification loss obtained by feeding the reconstructed images back into the classification module.
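A minimal PyTorch sketch of this pre-training objective, assuming an `encoder`/`decoder`/`classifier` decomposition, cross-entropy and MSE losses, and weights `alpha` and `beta`; these names and loss choices are our assumptions, and the paper's actual loss functions differ (see “Choices of loss functions” below).

```python
import torch
import torch.nn.functional as F

def nsae_pretrain_loss(encoder, decoder, classifier, x, y,
                       alpha=1.0, beta=1.0):
    """One NSAE pre-training loss on a source-domain batch (x, y).

    encoder/decoder/classifier and the weights alpha, beta are
    placeholders; the paper's exact loss weighting may differ.
    """
    z = encoder(x)                           # latent features
    x_rec = decoder(z)                       # "noisy" reconstruction of x

    logits = classifier(z)                   # classify the original input
    logits_rec = classifier(encoder(x_rec))  # classify the reconstruction

    loss_cls = F.cross_entropy(logits, y)
    loss_cls_rec = F.cross_entropy(logits_rec, y)  # noise-enhanced term
    loss_rec = F.mse_loss(x_rec, x)                # reconstruction term

    return loss_cls + alpha * loss_cls_rec + beta * loss_rec
```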
Fine-tune on the target domain
Fine-tuning proceeds in two steps (a code sketch follows the list):
1) In the first step, support images are sampled from the target domain to fine-tune the autoencoder architecture; the model aims at minimizing the reconstruction loss
2) In the second step, only the encoder is kept and fine-tuned on the support set with the classification loss
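A minimal PyTorch sketch of the two-step procedure, under the same assumed `encoder`/`decoder`/`classifier` decomposition; the optimizer, step counts, and learning rate are placeholders rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def finetune_two_steps(encoder, decoder, classifier,
                       support_x, support_y,
                       steps1=100, steps2=100, lr=1e-3):
    """Two-step fine-tuning on a target-domain support set (sketch)."""
    # Step 1: fine-tune the full autoencoder with a reconstruction loss.
    opt = torch.optim.Adam(list(encoder.parameters()) +
                           list(decoder.parameters()), lr=lr)
    for _ in range(steps1):
        opt.zero_grad()
        loss = F.mse_loss(decoder(encoder(support_x)), support_x)
        loss.backward()
        opt.step()

    # Step 2: drop the decoder; fine-tune encoder + classifier
    # with the classification loss on the support labels.
    opt = torch.optim.Adam(list(encoder.parameters()) +
                           list(classifier.parameters()), lr=lr)
    for _ in range(steps2):
        opt.zero_grad()
        loss = F.cross_entropy(classifier(encoder(support_x)), support_y)
        loss.backward()
        opt.step()
```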
Choices of loss functions
The paper uses different loss functions for pre-training and fine-tuning; see the original paper for the specific choices.