On the morning of May 23, 2020, Jia Jia, BAAI Young Scientist and tenured associate professor and Ph.D. supervisor in the Department of Computer Science and Technology at Tsinghua University, gave a talk titled "NLP in IJCAI 2020" at the ACL-IJCAI-SIGIR Top Conference Paper Report Session (AIS 2020), hosted by the Youth Working Committee of the Chinese Information Processing Society of China and organized by the Beijing Academy of Artificial Intelligence (BAAI) and Meituan-Dianping.
Jia Jia is a BAAI Young Scientist, a tenured associate professor and Ph.D. supervisor in the Department of Computer Science and Technology at Tsinghua University, Secretary-General of the CCF Technical Committee on Speech, Dialogue and Auditory Processing, and Secretary-General of the Speech Committee of the Chinese Information Processing Society of China, where she is responsible for the student-member work of the society's Youth Working Committee. Her main research area is affective computing.
IJCAI is a top international academic conference in artificial intelligence. Drawing on the papers accepted to IJCAI 2020, Jia Jia surveyed the main NLP results and research trends along two dimensions, algorithms and tasks, covering nine topics: unsupervised pre-training, cross-lingual learning, meta-learning and few-shot learning, transfer learning, bias, knowledge fusion, question answering, natural language generation, and multimodality.
The following are highlights from Jia Jia's talk.
Compiled by: Luo Li, BAAI Community
I. NLP hot topics in the IJCAI 2020 word cloud
More than 80 IJCAI 2020 papers are related to natural language processing. A word-cloud analysis of their keywords shows that deep learning still holds a dominant position in NLP.
Beyond deep learning, the word cloud reveals the other research hotspots of 2020, which fall into four areas:
(1) Generation tasks, such as dialogue generation and paragraph generation.
(2) Network architecture design, where researchers are particularly fond of attention mechanisms.
(3) Entity-relation extraction and entity recognition, both studied extensively at this year's IJCAI.
(4) Combining knowledge with neural networks: a growing number of researchers design their model frameworks by integrating knowledge into neural networks.
Jia Jia then summarized the NLP-related research at IJCAI 2020 along these two dimensions (algorithms and tasks) and nine topics.
II. Algorithm-level summary of NLP research
1. Unsupervised pre-training
Pre-trained language models have long been a research hotspot in NLP and have greatly improved performance across NLP tasks.
Figure 3 shows the series of general-purpose language models that followed BERT. Several IJCAI 2020 papers likewise focus on language-model pre-training. These pre-trained models include general-purpose ones, such as EViLBERT [1] and AdaBERT [2], as well as models pre-trained for specific tasks, such as BERT-INT [3], BERT-PLI [4], and FinBERT [5].
EViLBERT learns multimodal sense embeddings through multimodal pre-training, doing away with image captions, and achieves good results; AdaBERT uses neural architecture search to compress parameters, addressing BERT's heavy time and parameter costs; BERT-INT tackles knowledge-graph alignment; BERT-PLI addresses legal case retrieval; and FinBERT targets financial text mining.
BERT has already greatly advanced the field of NLP. Jia Jia predicted that BERT-related NLP research in the coming years will focus on two aspects:
(1) how to speed up unsupervised language-model pre-training;
(2) how to find better network architectures while reducing time overhead.
2. Cross-lingual learning
In recent years the NLP community has paid increasing attention to cross-lingual learning, which answers a large practical demand, and IJCAI 2020 also takes up cross-lingual problems. Their significance is twofold: they promote cultural exchange, and, more importantly, they greatly ease the deployment of NLP technology in the vast number of non-English scenarios. Hot topics here include cross-lingual word embeddings, unsupervised models, and machine translation.
Figure 4 shows an example of cross-lingual learning: by learning cross-lingual word embeddings, words with similar meanings in different languages obtain similar vectors.
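The idea in Figure 4 can be made concrete with a small sketch (every vector below is invented for illustration; real systems learn them from corpora): once two languages share one embedding space, translation candidates can be retrieved by cosine similarity.

```python
import numpy as np

# Toy shared embedding space; all vectors are invented for illustration.
embeddings = {
    ("en", "cat"):   np.array([0.90, 0.10, 0.00]),
    ("es", "gato"):  np.array([0.88, 0.12, 0.05]),
    ("en", "house"): np.array([0.10, 0.90, 0.20]),
    ("es", "casa"):  np.array([0.12, 0.85, 0.22]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def translate(word, src, tgt):
    """Return the target-language word whose vector is closest to `word`'s."""
    query = embeddings[(src, word)]
    candidates = [(w, cosine(query, vec))
                  for (lang, w), vec in embeddings.items() if lang == tgt]
    return max(candidates, key=lambda p: p[1])[0]

print(translate("cat", "en", "es"))    # gato
print(translate("house", "en", "es"))  # casa
```

The hard part, which the unsupervised methods below tackle, is aligning the two spaces in the first place without a bilingual dictionary.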
In unsupervised cross-lingual research, pre-training cross-lingual models is a focus of attention. At IJCAI 2020, UniTrans [6] studied unsupervised cross-lingual named entity recognition, and another paper explored unsupervised domain adaptation of pre-trained cross-lingual models [7].
Supervised methods still outperform unsupervised ones in cross-lingual research, and parallel corpora remain essential for problems such as machine translation. Among the supervised cross-lingual studies at IJCAI 2020, one paper explored using parallel corpora for cross-lingual paraphrase generation [8], i.e., bilingual generation, and another used cross-lingual annotation to tackle word sense disambiguation [9].
Machine translation is itself an important direction of cross-lingual research; IJCAI 2020 includes seven papers on it.
3. Meta-learning and few-shot learning
In recent years, meta-learning and few-shot learning have gradually become research hotspots, and IJCAI 2020 explores their application to NLP. Few-shot learning is widely used in classification tasks: with it, a neural network can generalize to new classes from only a handful of samples. Meta-learning is an important means of achieving few-shot learning, with MAML (Model-Agnostic Meta-Learning) as its representative algorithm.
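MAML's two nested loops, adapting to each task with a few gradient steps and then updating the shared initialization, can be sketched on a toy family of regression tasks (everything below, the tasks, learning rates, and the first-order shortcut, is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_grad(w, x, y):
    # Gradient of the mean squared error of the linear model y_hat = w * x.
    return np.mean(2 * (w * x - y) * x)

def sample_task():
    """A 'task' is a regression problem y = a * x with its own slope a."""
    a = rng.uniform(0.5, 1.5)
    x = rng.uniform(-1, 1, size=5)
    return (x, a * x, x, a * x)  # (support_x, support_y, query_x, query_y)

def maml_step(w, tasks, inner_lr=0.01, outer_lr=0.01):
    """One meta-update: adapt to each task on its support set (inner loop),
    then move the shared initialization using the query-set gradients
    (outer loop, first-order approximation)."""
    meta_grad = 0.0
    for x_s, y_s, x_q, y_q in tasks:
        w_adapted = w - inner_lr * loss_grad(w, x_s, y_s)
        meta_grad += loss_grad(w_adapted, x_q, y_q)
    return w - outer_lr * meta_grad / len(tasks)

w = 0.0
for _ in range(500):
    w = maml_step(w, [sample_task() for _ in range(4)])
print(w)  # the meta-learned initialization drifts toward the mean task slope
```

The payoff is that from this initialization, any single task in the family can be fit with only a couple of gradient steps, which is exactly the few-shot setting.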
Several IJCAI 2020 papers investigate these techniques in NLP. One paper [10] uses meta-learning to study question answering over complex knowledge bases; on the few-shot side, one study applies few-shot learning at the intersection of medicine and NLP [11], classifying diseases from medical records.
4. Transfer learning
Transfer learning, a long-standing hotspot of machine learning, is also very active at IJCAI 2020. In the deep-learning era, how to transfer already-learned knowledge to other domains, and in particular how to transfer the knowledge contained in large-scale unlabeled corpora to individual tasks, has attracted wide attention.
The most typical pattern in transfer learning is pre-training followed by fine-tuning, a pattern that has drawn ever more attention from NLP researchers as BERT has become ubiquitous.
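The pattern can be sketched in a few lines (the "pre-trained" vectors below are invented; in practice they come from a model such as BERT): the expensively pre-trained encoder is reused frozen, and only a small task head is trained on the labeled data.

```python
import numpy as np

# Stand-in for a pre-trained model's word vectors (invented for illustration;
# in practice these come from something like BERT and stay frozen).
pretrained = {
    "good": np.array([1.0, 0.2]),  "great": np.array([0.9, 0.3]),
    "bad":  np.array([-1.0, 0.1]), "awful": np.array([-0.9, 0.0]),
}

def encode(sentence):
    # Frozen "pre-trained" encoder: average the word vectors.
    return np.mean([pretrained[w] for w in sentence.split()], axis=0)

# Fine-tuning: only the small task head (w, b) is trained on labeled data.
data = [("good great", 1), ("bad awful", 0), ("great good", 1), ("awful bad", 0)]
w, b = np.zeros(2), 0.0
for _ in range(200):
    for sent, y in data:
        x = encode(sent)
        p = 1 / (1 + np.exp(-(w @ x + b)))  # logistic task head
        w += 0.1 * (y - p) * x              # gradient step on cross-entropy
        b += 0.1 * (y - p)

def predict(sentence):
    return int(w @ encode(sentence) + b > 0)

print(predict("good"), predict("awful"))  # 1 0
```

Full fine-tuning additionally updates the encoder's own weights with a small learning rate; the frozen-encoder variant shown here is the cheaper feature-extraction end of the same pattern.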
Moving beyond the simple pre-train-then-fine-tune pattern, many researchers are exploring more advanced transfer-learning frameworks. At IJCAI 2020, one study addresses knowledge transfer for reading comprehension [12], and another studies text style transfer [13].
Besides task-level transfer, there is also dataset-level transfer (domain adaptation). The IJCAI 2020 paper Domain Adaptation for Semantic Parsing [14] presents domain adaptation for semantic parsing. All of these studies probe more advanced frameworks and are worth following closely.
5. Bias
In NLP, imbalanced datasets and all manner of ingrained prejudice produce various kinds of bias, such as gender bias and racial bias. Left untreated, such bias leads to discrimination against particular groups.
Take Figure 5 as an example: visualizing word embeddings reveals that many words' embeddings correlate with gender. Words like "brilliant" and "genius" tend to sit closer to male-associated words in the embedding space, while words like "dance" and "beautiful" tend to sit closer to female-associated ones. Removing this kind of bias is crucial for NLP algorithms.
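One common way to quantify this effect, sketched here with invented toy vectors, is to project each word onto a "gender direction" built from gendered word pairs, as in direct-bias analyses of embeddings:

```python
import numpy as np

# Invented toy vectors; real analyses use embeddings trained on large corpora.
vecs = {
    "he":        np.array([ 1.0, 0.0, 0.1]),
    "she":       np.array([-1.0, 0.0, 0.1]),
    "brilliant": np.array([ 0.4, 0.8, 0.2]),
    "dance":     np.array([-0.5, 0.7, 0.3]),
    "table":     np.array([ 0.0, 0.9, 0.4]),
}

# A simple "gender direction" built from one gendered pair.
g = vecs["he"] - vecs["she"]
g = g / np.linalg.norm(g)

def gender_score(word):
    """Projection onto the gender direction: > 0 leans male, < 0 leans female."""
    v = vecs[word]
    return float((v / np.linalg.norm(v)) @ g)

for word in ("brilliant", "dance", "table"):
    print(word, round(gender_score(word), 3))
```

Many debiasing methods subtract exactly this projection from each word vector; evaluation work such as WEFE [15] aims to standardize how such fairness measurements are compared across embeddings.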
Several IJCAI 2020 papers concern bias in NLP. In WEFE [15], the authors propose a framework for testing whether word embeddings are fair; another paper proposes a new testing method and platform and rigorously tests the fairness of NLP models [16].
6. Knowledge fusion
Although large-scale corpora are widely used in NLP models, current NLP research lacks a structured understanding of them, particularly when it comes to comprehending complex language. In recent years many researchers have therefore tried to fuse structured knowledge such as knowledge graphs into NLP frameworks, as in the ERNIE framework from ACL 2019 [17].
Figure 6: An example of knowledge fusion
Figure 6 shows an example from the ERNIE paper. Solid lines denote existing knowledge, while red and green dashed lines denote facts extracted from the sentence. With structured knowledge folded in, entity-relation extraction from the sentence achieves better results.
Many researchers focus on how to inject knowledge into NLP models. IJCAI 2020 includes ten such papers, which fall into two categories:
(1) Using knowledge graphs to boost existing NLP tasks: knowledge is used to improve reading comprehension [18] and QA [19], to detect event causality [20], to enhance neural machine translation [21], and to inform dialogue generation [22].
(2) Building, completing, and generating knowledge. Among work on knowledge-graph construction and completion, Mucko [23] explores cross-modal knowledge reasoning, BERT-INT [24] studies knowledge-graph alignment, and TransOMCS [25] studies how to generate commonsense knowledge. These are the representative IJCAI 2020 works in this direction.
III. Task-level summary of NLP research
1. Question answering
In recent years, QA research has gradually evolved from simple QA to complex QA. Simple QA can be understood as plain pattern matching, whereas complex QA usually requires reasoning, even multi-hop reasoning. Three IJCAI 2020 papers combine knowledge graphs with QA to enable more complex QA: Mucko [26]; Retrieve, Program, Repeat [27]; and Formal Query Building with Query Structure Prediction for Complex Question Answering over Knowledge Base [28]. LogiQA [29] and Two-Phase Hypergraph Based Reasoning with Dynamic Relations for Multi-Hop KBQA [30] address reasoning and multi-hop reasoning in QA. QA is also often combined with other tasks into multi-task frameworks that lift the performance of several tasks at once.
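What multi-hop reasoning over a knowledge base amounts to can be sketched on a toy graph (all facts invented for illustration): each hop follows one relation, and the answer is reached by chaining hops. The hard part in the cited papers is predicting the relation path from the natural-language question; the chaining itself looks like this:

```python
# Toy knowledge base of (head, relation, tail) triples; all facts invented.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "located_in", "Europe"),
]

def hop(entities, relation):
    """Follow one relation from a set of entities (a single reasoning hop)."""
    return {t for (h, r, t) in triples if r == relation and h in entities}

def answer(entity, relation_path):
    """Answer a complex question by chaining hops along a relation path."""
    current = {entity}
    for relation in relation_path:
        current = hop(current, relation)
    return current

# "Which continent is the country whose capital is Paris in?"  (two hops)
print(answer("Paris", ["capital_of", "located_in"]))  # {'Europe'}
```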
At IJCAI 2020, one study combines QA with reading comprehension and entity-relation extraction [31], and another combines QA with text generation [32]; both are good examples of pairing multi-task learning with QA.
2. Natural language generation
Natural language generation (NLG) has broad application prospects and has been a research hotspot in recent years. Before deep learning became widespread, traditional NLG required multiple stages such as content planning, information aggregation, and surface realization. With the arrival of generative models such as GANs and VAEs, and of sequence models such as Seq2Seq and the Transformer, deep-learning-based NLG has made great strides.
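At the heart of Seq2Seq- and Transformer-style generation sits a decoding loop that repeatedly asks the model for the next token. A minimal sketch, with a hand-written next-token table standing in for a learned model:

```python
# The decoding loop shared by Seq2Seq/Transformer generators, with a
# hand-written next-token distribution standing in for a learned model.
next_token_probs = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a":   {"dog": 0.7, "</s>": 0.3},
    "cat": {"sat": 0.6, "</s>": 0.4},
    "dog": {"</s>": 1.0},
    "sat": {"</s>": 1.0},
}

def greedy_decode(max_len=10):
    tokens = ["<s>"]
    while len(tokens) < max_len:
        probs = next_token_probs[tokens[-1]]
        best = max(probs, key=probs.get)  # always take the most likely token
        if best == "</s>":                # stop at the end-of-sequence symbol
            break
        tokens.append(best)
    return " ".join(tokens[1:])

print(greedy_decode())  # the cat sat
```

The toy table conditions only on the previous token; real decoders condition on the whole generated prefix plus the source input, and often replace the greedy choice with beam search or sampling.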
A good deal of IJCAI 2020 work focuses on NLG: twelve papers study generation. They are spread over different tasks and generation targets, such as dialogue generation [33], paraphrase generation [34], response generation [35], legal text generation [36], and comment generation [37], and several address general NLG frameworks that could later be applied across tasks. Riding the rapid progress of pre-trained models, IJCAI 2020 also features work combining pre-training with NLG, namely ERNIE-GEN [38], alongside research on generating text from structured data [39] and on evaluating NLG via unbalanced optimal transport [40]. NLG is thus being studied from every angle, which also reflects how popular IJCAI is within the NLP community.
3. Multimodality
Multimodality, especially combining text with other modalities such as speech, video, and images, has long been a hot research topic and forms an important part of IJCAI 2020, which this year includes seven multimodality-related papers.
Visual question answering (VQA), one such hotspot, accounts for four IJCAI 2020 papers, which study how to strengthen QA with visual information from the angles of knowledge reasoning [41], self-supervision [42], and network design [43]. One study exploits visual semantic reasoning for better video-text retrieval [44]; another addresses vision-and-language navigation [45], in which a model understands language and images jointly, grounds the locations and landmarks described in language in real scene images, and then executes the corresponding actions, thereby countering environment-induced bias and making navigation more robust. Spurred by BERT's rapid development, much IJCAI 2020 work builds pre-trained models that fold in the visual modality, achieving strong results across cross-modal tasks.
References
[1] Agostina Calabrese, Michele Bevilacqua, Roberto Navigli. EViLBERT: Learning Task-Agnostic Multimodal Sense Embeddings. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 481-487.
[2] Daoyuan Chen, Yaliang Li, Minghui Qiu, et al. AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 2463-2469.
[3] Xiaobin Tang, Jing Zhang, Bo Chen, et al. BERT-INT: A BERT-based Interaction Model For Knowledge Graph Alignment. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3174-3180.
[4] Yunqiu Shao, Jiaxin Mao, Yiqun Liu, et al. BERT-PLI: Modeling Paragraph-Level Interactions for Legal Case Retrieval. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3501-3507.
[5] Zhuang Liu, Degen Huang, Kaiyu Huang, et al. FinBERT: A Pre-trained Financial Language Representation Model for Financial Text Mining. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Special Track on AI in FinTech. 2020. pp. 4513-4519.
[6] Qianhui Wu, Zijia Lin, Börje F. Karlsson, et al. UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3926-3932.
[7] Juntao Li, Ruidan He, Hai Ye, et al. Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3672-3678.
[8] Mingtong Liu, Erguang Yang, Deyi Xiong, et al. Exploring Bilingual Parallel Corpora for Syntactically Controllable Paraphrase Generation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3955-3961.
[9] Edoardo Barba, Luigi Procopio, Niccolò Campolungo, et al. MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3837-3844.
[10] Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, et al. Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3679-3686.
[11] Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, et al. Generalized Zero-Shot Text Classification for ICD Coding. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 4018-4024.
[12] Xin Liu, Kai Liu, Xiang Li, et al. An Iterative Multi-Source Mutual Knowledge Transfer Framework for Machine Reading Comprehension. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3794-3800.
[13] Xiaoyuan Yi, Zhenghao Liu, Wenhao Li, et al. Text Style Transfer via Learning Style Instance Supported Latent Space. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3801-3807.
[14] Zechang Li, Yuxuan Lai, Yansong Feng, et al. Domain Adaptation for Semantic Parsing. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3723-3729.
[15] Pablo Badilla, Felipe Bravo-Marquez, Jorge Pérez. WEFE: The Word Embeddings Fairness Evaluation Framework. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 430-436.
[16] Pingchuan Ma, Shuai Wang, Jin Liu. Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 458-465.
[17] Zhengyan Zhang, Xu Han, Zhiyuan Liu, et al. ERNIE: Enhanced Language Representation with Informative Entities. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. pp. 1441-1451.
[18] Xin Liu, Kai Liu, Xiang Li, et al. An Iterative Multi-Source Mutual Knowledge Transfer Framework for Machine Reading Comprehension. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3794-3800.
[19] Yongrui Chen, Huiying Li, Yuncheng Hua, et al. Formal Query Building with Query Structure Prediction for Complex Question Answering over Knowledge Base. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3751-3758.
[20] Jian Liu, Yubo Chen, Jun Zhao. Knowledge Enhanced Event Causality Identification with Mention Masking Generalizations. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3608-3614.
[21] Yang Zhao, Jiajun Zhang, Yu Zhou, et al. Knowledge Graphs Enhanced Neural Machine Translation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 4039-4045.
[22] Sixing Wu, Ying Li, Dawei Zhang, et al. TopicKA: Generating Commonsense Knowledge-Aware Dialogue Responses Towards the Recommended Topic Fact. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3766-3772.
[23] Zihao Zhu, Jing Yu, Yujing Wang, et al. Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 1097-1103.
[24] Xiaobin Tang, Jing Zhang, Bo Chen, et al. BERT-INT: A BERT-based Interaction Model For Knowledge Graph Alignment. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3174-3180.
[25] Hongming Zhang, Daniel Khashabi, Yangqiu Song, et al. TransOMCS: From Linguistic Graphs to Commonsense Knowledge. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 4004-4010.
[26] Zihao Zhu, Jing Yu, Yujing Wang, et al. Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 1097-1103.
[27] Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, et al. Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3679-3686.
[28] Yongrui Chen, Huiying Li, Yuncheng Hua, et al. Formal Query Building with Query Structure Prediction for Complex Question Answering over Knowledge Base. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3751-3758.
[29] Jian Liu, Leyang Cui, Hanmeng Liu, et al. LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3622-3628.
[30] Jiale Han, Bo Cheng, Xu Wang. Two-Phase Hypergraph Based Reasoning with Dynamic Relations for Multi-Hop KBQA. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3615-3621.
[31] Tianyang Zhao, Zhao Yan, Yunbo Cao, et al. Asking Effective and Diverse Questions: A Machine Reading Comprehension based Framework for Joint Entity-Relation Extraction. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3948-3954.
[32] Weijing Huang, Xianfeng Liao, Zhiqiang Xie, et al. Generating Reasonable Legal Text through the Combination of Language Modeling and Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3687-3693.
[33] Hengyi Cai, Hongshen Chen, Yonghao Song, et al. Exemplar Guided Neural Dialogue Generation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3601-3607.
[34] Mingtong Liu, Erguang Yang, Deyi Xiong, et al. Exploring Bilingual Parallel Corpora for Syntactically Controllable Paraphrase Generation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3955-3961.
[35] Shifeng Li, Shi Feng, Daling Wang, et al. EmoElicitor: An Open Domain Response Generation Model with User Emotional Reaction Awareness. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3637-3643.
[36] Weijing Huang, Xianfeng Liao, Zhiqiang Xie, et al. Generating Reasonable Legal Text through the Combination of Language Modeling and Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3687-3693.
[37] Shijie Yang, Liang Li, Shuhui Wang, et al. A Structured Latent Variable Recurrent Network With Stochastic Attention For Generating Weibo Comments. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3962-3968.
[38] Dongling Xiao, Han Zhang, Yukun Li, et al. ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3997-4003.
[39] Yang Bai, Ziran Li, Ning Ding, et al. Infobox-to-text Generation with Tree-like Planning based Attention Network. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3773-3779.
[40] Yimeng Chen, Yanyan Lan, Ruibin Xiong, et al. Evaluating Natural Language Generation via Unbalanced Optimal Transport. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 3730-3736.
[41] Zihao Zhu, Jing Yu, Yujing Wang, et al. Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 1097-1103.
[42] Xi Zhu, Zhendong Mao, Chunxiao Liu, et al. Overcoming Language Priors with Self-supervised Learning for Visual Question Answering. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 1083-1089.
[43] Ganchao Tan, Daqing Liu, Meng Wang, et al. Learning to Discretely Compose Reasoning Module Networks for Video Captioning. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 745-752.
[44] Zerun Feng, Zhimin Zeng, Caili Guo, et al. Exploiting Visual Semantic Reasoning for Video-Text Retrieval. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 1005-1011.
[45] Yubo Zhang, Hao Tan, Mohit Bansal. Diagnosing the Environment Bias in Vision-and-Language Navigation. Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Main track. 2020. pp. 890-897.