Paper reading: "Few-shot Image Classification with Multi-facet Prototypes"

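Hello~
It's been two months since my last update.
The New Year has already come and gone. Happy Year of the Tiger, everyone!
I'm finishing off the last few paper-reading notes on few-shot learning~
After this, updates will come whenever they come.
Feel free to message me if you'd like to discuss.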

Paper title:
"Few-shot Image Classification with Multi-facet Prototypes"
Paper link: https://arxiv.org/pdf/2102.00801.pdf

Background

a. The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples.
b. A central challenge is that the available training examples are normally insufficient to determine which visual features are most characteristic of the considered categories.
c. Most methods rely only on visual features. However, in addition to the training examples, we usually also have access to the names of the image categories to be learned.

Work

a. We organise the visual features into facets, which intuitively group features of the same kind (e.g. features that are relevant to shape, color, or texture).
b. The approach rests on the assumption that (i) the importance of each facet differs from category to category and (ii) it is possible to predict facet importance from a pre-trained embedding of the category names.
c. In particular, we propose an adaptive similarity measure, relying on predicted facet importance weights for a given set of categories.
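
To make assumption (ii) more concrete, here is a minimal sketch of turning a class name embedding into facet importance weights. The linear scoring layer, the softmax normalisation, and all names (predict_facet_importance, weight_matrix, bias) are my own assumptions for illustration; the paper's actual predictor may look different.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_facet_importance(name_embedding, weight_matrix, bias):
    """Hypothetical predictor: one linear score per facet, then softmax.

    name_embedding: pre-trained embedding of the class name (list of floats)
    weight_matrix:  one row of weights per facet (assumed to be learned)
    bias:           one bias per facet (assumed to be learned)
    """
    scores = [
        sum(w * x for w, x in zip(row, name_embedding)) + b
        for row, b in zip(weight_matrix, bias)
    ]
    return softmax(scores)


# Toy values only; a real class name embedding would come from e.g. BERT or GloVe.
name_embedding = [0.2, -0.1, 0.4]
weight_matrix = [
    [1.0, 0.0, 0.0],  # facet 0 (say, shape)
    [0.0, 1.0, 0.0],  # facet 1 (say, color)
    [0.0, 0.0, 1.0],  # facet 2 (say, texture)
]
bias = [0.0, 0.0, 0.0]

weights = predict_facet_importance(name_embedding, weight_matrix, bias)
```

With these toy values the weights are positive, sum to 1, and the third facet gets the largest weight because it has the largest score.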

Method

a. We improve the performance of metric-based few-shot image classification methods by taking embeddings of the class names into account.
b. Different from existing methods, we use these class name embeddings to predict the importance of different facets, and then measure the distance between images and prototypes as a weighted sum of facet-specific distances.
c. The resulting facet-based distance can then be combined with a standard distance, e.g. the Euclidean distance in the case of ProtoNet.
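
As a rough sketch of the last point: a ProtoNet-style prototype is the mean of the support embeddings of a class, and the facet-based distance is added on top of the Euclidean distance to that prototype. The mixing coefficient lam and the function names are assumptions of mine, and the facet-based distances are passed in as precomputed numbers; how the paper actually combines the two terms is not reproduced here.

```python
import math


def prototype(support_embeddings):
    """Class prototype as the mean of the support embeddings (ProtoNet style)."""
    n = len(support_embeddings)
    dim = len(support_embeddings[0])
    return [sum(e[i] for e in support_embeddings) / n for i in range(dim)]


def euclidean(a, b):
    """Standard Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(standard_dist, facet_dist, lam=0.5):
    """Standard distance plus a weighted facet-based distance.

    lam is a hypothetical mixing coefficient, not a value from the paper.
    """
    return standard_dist + lam * facet_dist


def classify(query, prototypes, facet_dists, lam=0.5):
    """Return the class whose combined distance to the query is smallest."""
    scores = {
        c: combined_distance(euclidean(query, p), facet_dists[c], lam)
        for c, p in prototypes.items()
    }
    return min(scores, key=scores.get)


# Toy 2-way 2-shot episode with 2-dimensional embeddings.
prototypes = {
    "cat": prototype([[0.0, 0.0], [2.0, 0.0]]),
    "dog": prototype([[4.0, 4.0], [6.0, 4.0]]),
}
query = [2.0, 1.0]
# Placeholder facet-based distances, one per class (assumed precomputed).
facet_dists = {"cat": 0.3, "dog": 2.0}

predicted = classify(query, prototypes, facet_dists)
```

Here the query lands on "cat", since both the Euclidean term and the facet-based term favour that prototype. Setting lam=0 falls back to plain nearest-prototype classification.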

Facet Identification


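Given these importance scores, we can measure the distance between a query image q and the prototype of class c as a weighted sum of facet-specific distances.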
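
A minimal sketch of that weighted sum, under simplifying assumptions of mine: each facet is a contiguous, equal-sized slice of the feature vector, and the facet-specific distance is the squared Euclidean distance on that slice. The paper's own way of identifying facets is not reproduced here, and all function names are hypothetical.

```python
def split_into_facets(vector, num_facets):
    """Assumption: facets are contiguous, equal-sized slices of the vector."""
    if len(vector) % num_facets != 0:
        raise ValueError("vector length must be divisible by num_facets")
    size = len(vector) // num_facets
    return [vector[i * size:(i + 1) * size] for i in range(num_facets)]


def squared_distance(a, b):
    """Squared Euclidean distance, used here as the facet-specific distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def facet_weighted_distance(query, proto, importance):
    """Weighted sum of facet-specific distances between query q and a prototype.

    importance holds one weight per facet for the class of this prototype.
    """
    num_facets = len(importance)
    q_facets = split_into_facets(query, num_facets)
    p_facets = split_into_facets(proto, num_facets)
    return sum(
        w * squared_distance(qf, pf)
        for w, qf, pf in zip(importance, q_facets, p_facets)
    )


# Toy example: 4-dimensional features split into 2 facets.
q = [1.0, 0.0, 2.0, 2.0]
proto_c = [0.0, 0.0, 2.0, 0.0]

# The per-facet squared distances are 1 (first facet) and 4 (second facet).
d_first_matters = facet_weighted_distance(q, proto_c, [0.9, 0.1])
d_second_matters = facet_weighted_distance(q, proto_c, [0.1, 0.9])
```

The same query and prototype end up farther apart when the importance shifts to the second facet, because that is the facet on which they disagree most.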

Experiments


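Experiments on two standard datasets show that our model brings some improvement over state-of-the-art methods. We also find that class name embeddings obtained from the BERT language model give better results than GloVe vectors, even though the latter have been popular in FSL models.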

In effect, this paper is also a cross-modal combination of NLP and images: it fuses class name embeddings with few-shot image recognition.


END

All the best!
