[Multimodal Paper Walkthrough] Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Name: ALBEF
Keywords: Multimodal; Contrastive Learning; Knowledge Distillation
Year: 2021
Source: NeurIPS
Paper: https://arxiv.org/abs