1.Tailor Versatile Multi-Modal Learning for Multi-Label Emotion Recognition
Yi Zhang, Mingyuan Chen, Jundong Shen, Chongjun Wang
2.Sentiment and Emotion-Aware Multi-Modal Complaint Identification
Apoorva Singh, Soumyodeep Dey, Anamitra Singha, Sriparna Saha
3.Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective.
Emmanuelle Salin, Badreddine Farah, Stéphane Ayache, Benoit Favre
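Paper: https://hal.archives-ouvertes.fr/hal-03521715/file/11931.SalinE-7.pdf
In recent years, transformer-based vision-language pre-trained models have markedly improved joint text-image embeddings. The authors use a set of text, image, and multimodal probing tasks to compare pre-trained and fine-tuned representations at the unimodal and multimodal levels, and they introduce new datasets built specifically for multimodal probing. The results indicate that vision-language pre-training captures the concept of color at the multimodal level, while its understanding of position and size relies more on text. On semantically adversarial examples, the authors find that multimodal pre-trained models can accurately pick out subtle multimodal differences. They also find that fine-tuning a model on multimodal tasks (VQA, NLVR) does not necessarily improve its multimodal representations.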
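As a rough illustration of what a probing task looks like in general, the sketch below trains a small linear classifier on frozen feature vectors and reports held-out accuracy. It is a generic linear probe under toy assumptions, not the setup used in the paper: the 2-d vectors stand in for frozen model representations, and the names `train_linear_probe` and `probe_accuracy` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def train_linear_probe(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 200,
    lr: float = 0.5,
) -> Tuple[List[float], float]:
    """Fit a logistic-regression probe with batch gradient descent.

    The features are treated as frozen: only the probe's weights are updated.
    """
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            logit = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-logit)) - y  # sigmoid(logit) - label
            for d in range(dim):
                grad_w[d] += error * x[d]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(
    weights: Sequence[float],
    bias: float,
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> float:
    """Fraction of examples the probe classifies correctly."""
    correct = 0
    for x, y in zip(features, labels):
        logit = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((logit > 0) == bool(y))
    return correct / len(labels)


# Toy stand-ins for frozen representations; the label depends on the first dimension.
train_x = [[1.0, 0.2], [0.9, -0.1], [-1.0, 0.3], [-0.8, -0.2]]
train_y = [1, 1, 0, 0]
test_x = [[0.7, 0.0], [-0.9, 0.1]]
test_y = [1, 0]

weights, bias = train_linear_probe(train_x, train_y)
print("held-out probe accuracy:", probe_accuracy(weights, bias, test_x, test_y))
```

High probe accuracy suggests the probed property is linearly recoverable from the representation; comparing probes trained on pre-trained versus fine-tuned features is the kind of comparison the summary describes.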
4.Multi-Modal Answer Validation for Knowledge-Based VQA
Jialin Wu, Jiasen Lu, Ashish Sabharwal, Roozbeh Mottaghi
Paper: https://arxiv.org/pdf/2103.12248.pdf
Knowledge-based visual question answering.
5.UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation
Zhengkun Zhang, Xiaojun Meng, Yasheng Wang, Xin Jiang, Qun Liu, Zhenglu Yang
Paper: https://arxiv.org/pdf/2109.05812.pdf
Multimodal summarization.
6.MIA-Former: Efficient and Robust Vision Transformers via Multi-grained Input-Adaptation
Z Yu, Y Fu, S Li, C Li, Y Lin
7.Hierarchical Cross-Modality Semantic Correlation Learning Model for Multimodal Summarization
Paper: https://arxiv.org/pdf/2112.12072v1.pdf
A hierarchical cross-modality semantic correlation learning model (HCSCL).
8.Knowledge Bridging for Empathetic Dialogue Generation
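Enriches the dialogue history through joint interaction with external knowledge and builds an emotional context graph. Emotional context representations are then learned from this knowledge-enriched graph, and emotional signals are extracted from it. Finally, an emotional cross-attention mechanism learns emotional dependencies from the emotional context graph.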
9.Hybrid Curriculum Learning for Emotion Recognition in Conversation
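A hybrid curriculum learning framework for ERC (emotion recognition in conversation). The framework comprises two curricula: a conversation-level curriculum (CC) [difficulty measurer] and an utterance-level curriculum (UC) [training scheduler].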
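The two ingredients named above, a difficulty measurer and a training scheduler, can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the difficulty score (the fraction of adjacent utterances whose emotion label changes) and the staged schedule are assumptions chosen for the example, and the function names `emotion_shift_ratio` and `curriculum_schedule` are hypothetical.

```python
import math
from typing import List, Sequence


def emotion_shift_ratio(labels: Sequence[str]) -> float:
    """Assumed difficulty measurer: share of adjacent utterance pairs whose label differs."""
    if len(labels) < 2:
        return 0.0
    shifts = sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)
    return shifts / (len(labels) - 1)


def curriculum_schedule(
    conversations: List[Sequence[str]], num_stages: int
) -> List[List[Sequence[str]]]:
    """Assumed training scheduler: stage k exposes the easiest k/num_stages of the data."""
    ordered = sorted(conversations, key=emotion_shift_ratio)
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = max(1, math.ceil(len(ordered) * k / num_stages))
        stages.append(ordered[:cutoff])
    return stages


# Each conversation is represented only by its per-utterance emotion labels.
conversations = [
    ["joy", "anger", "joy", "sad"],     # every adjacent pair differs
    ["neutral", "neutral", "neutral"],  # no emotion shift
    ["sad", "sad", "joy"],              # one shift in two adjacent pairs
]
stages = curriculum_schedule(conversations, num_stages=3)
for index, stage in enumerate(stages, start=1):
    print(f"stage {index}: {len(stage)} conversation(s)")
```

Training would iterate over the stages in order, so the model sees conversations with stable emotions before those with frequent shifts.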
10.CEM: Commonsense-aware Empathetic Response Generation
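An empathetic response generation method that uses commonsense knowledge to obtain more information about the user's situation, then uses that additional information to strengthen the empathy expressed in the generated responses.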
11.OneRel: Joint Entity and Relation Extraction with One Module in One Step
Paper: https://arxiv.org/abs/2203.05412
Casts the joint extraction task as a fine-grained triple classification problem and proposes a new joint extraction model.
12.MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding
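Paper: https://arxiv.org/pdf/2112.10728.pdf
The authors propose a new QA evaluation benchmark for cross-modal grounding. It consists of multi-hop questions that require reasoning over image-text pairs to identify the visual object being referred to, and then predicting a span from the news body text to answer the question. The authors also propose a multimodal data augmentation framework, based on multimodal knowledge extraction and question-answer generation, that provides weak supervision for this task.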
13.CLIP-Event: Connecting Text and Images with Event Structures
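Paper: https://arxiv.org/pdf/2201.05078.pdf
Vision-language pre-trained models typically work by learning the alignment between images and text. This paper uses a contrastive learning framework to strengthen such models' understanding of structured event information, and it collects event-rich image-text pairs for pre-training.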
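For readers unfamiliar with contrastive learning over image-text pairs, the sketch below computes a generic InfoNCE-style loss in the image-to-text direction. It is not CLIP-Event's objective, which builds on event structures; the toy 2-d embeddings, the temperature value, and the function name `info_nce` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(
    image_embs: Sequence[Sequence[float]],
    text_embs: Sequence[Sequence[float]],
    temperature: float = 0.1,
) -> float:
    """Mean image-to-text InfoNCE loss; item i in each list forms the positive pair."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    total = 0.0
    for i, image in enumerate(images):
        logits = [dot(image, text) / temperature for text in texts]
        # Numerically stable log-sum-exp over all candidate texts.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        total += log_denominator - logits[i]
    return total / len(images)


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print("aligned pairs give the lower loss:", aligned < mismatched)
```

The loss is small when each image is most similar to its own caption and large when a different caption scores higher, which is the signal a contrastive framework trains on.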
14.Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching
Huatian Zhang, Zhendong Mao, Kun Zhang, Yongdong Zhang
15.Event-Image Fusion Stereo Using Cross-Modality Feature Propagation
Hoonhee Cho, Kuk-Jin Yoon
16.MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning
Wenqiao Zhang, Haochen Shi, Jiannan Guo, Shengyu Zhang, Qingpeng Cai, Juncheng Li, Sihui Luo, Yueting Zhuang
17.Hierarchical Cross-Modality Semantic Correlation Learning Model for Multimodal Summarization
Litian Zhang, Junshu Pan, Xiaoming Zhang, Feiran Huang
18.Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation
Chih-Chun Yang, Wan-Cyuan Fan, Cheng-Fu Yang, Yu-Chiang Frank Wang
19.Cross-Modal Coherence for Text-to-Image Retrieval
Malihe Alikhani, Fangda Han, Hareesh Ravi, Mubbasir Kapadia, Vladimir Pavlovic, Matthew Stone
20.D-Vlog: Multimodal Vlog Dataset for Depression Detection
Jeewoo Yoon, Chaewon Kang, Seungbae Kim, Jinyoung Han
21.BROS: A Pre-Trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park
22.Selecting Optimal Context Sentences for Event-Event Relation Extraction
Hieu Man Duc Trong, Nghia Ngo Trung, Linh Van Ngo, Thien Huu Nguyen
23.Hyperbolic Disentangled Representation for Fine-Grained Aspect Extraction
Chang-You Tai, Ming-Yao Li, Lun-Wei Ku
24.Language Model Priming for Cross-Lingual Event Extraction
Steven Fincke, Shantanu Agarwal, Scott Miller, Elizabeth Boschee
25.Content-Variant Reference Image Quality Assessment via Knowledge Distillation
Guanghao Yin, Wei Wang, Zehuan Yuan, Chuchu Han, Wei Ji, Shouqian Sun, Changhu Wang
26.Boosting Contrastive Learning with Relation Knowledge Distillation
Kai Zheng, Yuanjiang Wang, Ye Yuan
27.Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay
Kuluhan Binici, Shivam Aggarwal, Nam Trung Pham, Karianto Leman, Tulika Mitra
28.Cross-Task Knowledge Distillation in Multi-Task Recommendation
Chenxiao Yang, Junwei Pan, Xiaofeng Gao, Tingyu Jiang, Dapeng Liu, Guihai Chen
29.Improving Neural Cross-Lingual Abstractive Summarization via Employing Optimal Transport Distance for Knowledge Distillation
Thong Nguyen, Luu Anh Tuan
30.Knowledge Distillation via Constrained Variational Inference
Ardavan Saeedi, Yuria Utsumi, Li Sun, Kayhan Batmanghelich, Li-wei H. Lehman
31.Up to 100× Faster Data-Free Knowledge Distillation
Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song
32.Adversarial Data Augmentation for Task-Specific Knowledge Distillation of Pre-Trained Transformers
Minjia Zhang, Niranjan Uma Naresh, Yuxiong He
33.Unified Named Entity Recognition as Word-Word Relation Classification
Jingye Li, Donghong Ji, Jiang Liu, Hao Fei, Meishan Zhang, Shengqiong Wu, Chong Teng, Fei Li
34.Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph
Liyan Xu, Xuchao Zhang, Bo Zong, Yanchi Liu, Wei Cheng, Jingchao Ni, Haifeng Chen, Liang Zhao, Jinho D. Choi
35.From Good to Best: Two-Stage Training for Cross-Lingual Machine Reading Comprehension
Nuo Chen, Linjun Shou, Ming Gong, Jian Pei