Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph

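1. Background

Knowledge in pretrained language models is acquired inefficiently and unsystematically. Language models do implicitly learn some knowledge by pretraining on large corpora, but this way of acquiring knowledge makes no explicit use of knowledge bases or knowledge graphs, so it is relatively inefficient.
Existing work that adds external knowledge to strengthen model reasoning relies only on isolated knowledge triples. The paper argues that this ignores the rich correlations among pieces of knowledge in the knowledge graph, and that these correlations could provide multiple plausible pieces of evidence for complex reasoning.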

2. Goal of the Paper

Perform language generation by reasoning over a commonsense knowledge graph. Keyword: knowledge graph reasoning

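3. Contributions

Proposes a text generation model that performs multi-hop reasoning on a commonsense knowledge graph

[Figure: GRF model architecture]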

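Use GCN to encode the static graph context to obtain graph-aware representations for the concepts and the relations (see 4.1 and 4.2 for GCN basics). In other words, this step produces the encoded representation of the knowledge graph.

Node feature update:

[Equation image: GCN node feature update, Eqs. (2) and (3)]

Here, N(v) denotes v's neighborhood, which consists of pairs of a node u and the connecting relation r.
Eq. (2) aggregates the features of all edges and nodes adjacent to node v; Eq. (3) uses the aggregated neighborhood features from Eq. (2) to update the features of node v.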
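Since the equation image did not survive, here is a minimal sketch of what an update of this shape could look like. The specifics are assumptions made for illustration, not taken from these notes: neighbor messages are composed as node feature minus relation feature, averaged over N(v), added to a transformed self feature, and passed through ReLU. The names `update_node`, `W_n` and `W_s` are hypothetical.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def update_node(v, h_node, h_rel, neighbors, W_n, W_s):
    """One node update in the spirit of Eqs. (2) and (3).

    neighbors[v] is N(v): a list of (u, r) pairs.
    Assumed (not from the notes): subtraction composition, mean
    aggregation, ReLU activation.
    """
    dim = len(h_node[v])
    agg = [0.0] * dim
    # Eq. (2)-style step: collect features of adjacent nodes and edges.
    for u, r in neighbors[v]:
        composed = [a - b for a, b in zip(h_node[u], h_rel[r])]
        msg = matvec(W_n, composed)
        agg = [a + m for a, m in zip(agg, msg)]
    if neighbors[v]:
        agg = [a / len(neighbors[v]) for a in agg]
    # Eq. (3)-style step: combine with the node's own feature.
    self_term = matvec(W_s, h_node[v])
    return [max(0.0, a + s) for a, s in zip(agg, self_term)]


identity = [[1.0, 0.0], [0.0, 1.0]]
h_node = {"a": [1.0, 0.0], "b": [0.0, 2.0]}
h_rel = {"r": [0.0, 1.0]}
neighbors = {"a": [("b", "r")], "b": []}
new_a = update_node("a", h_node, h_rel, neighbors, identity, identity)
```

With identity weights, the single neighbor message is [0, 1] and the self term is [1, 0], so `new_a` is [1.0, 1.0].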

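Structural feature update:

[Equation image: GCN structural feature update]

This one is simple: the relation features are updated directly with a linear transformation.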
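As a tiny illustration of "updated directly with a linear transformation", the sketch below multiplies a relation feature by a weight matrix. The name `W_r` and the absence of a bias or activation are assumptions for the example.

```python
def update_relation(h_r, W_r):
    """Linear update of a relation feature: returns W_r @ h_r."""
    return [sum(w * x for w, x in zip(row, h_r)) for row in W_r]


# Toy 2-dimensional relation feature and weight matrix (made-up values).
new_h_r = update_relation([1.0, 2.0], [[2.0, 0.0], [0.0, 0.5]])
```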

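Note that the paper considers reasoning over the whole, very large knowledge graph to be too complex, so it extracts source concepts from the input text, builds a sub-graph from them, and reasons over that sub-graph instead. In the paper's words: The sub-graph consists of inter-connected H-hop paths starting from the source concepts Cx extracted from the input text.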

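Use GPT-2 to model the sequence. GPT-2 is built from Transformer decoder blocks and outputs one token at a time; it is typically used for text generation.
Devise a dynamic reasoning module. This module utilizes both structural patterns of the knowledge graph (from the GCN encoding step) and contextual information (from the GPT-2 step) to propagate evidence along relational paths. That is, it reasons with knowledge graph information plus the textual context. The outcome of the reasoning is expressed as a score on each node.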

First, initialize the score of every node in the knowledge graph G: nodes corresponding to the concepts in Cx are given a score of 1, while other unvisited nodes are assigned 0.

Then, update the node scores through multi-hop reasoning on the relational paths on G. Visited nodes are used to update the scores of their unvisited neighbors, hop by hop, until every node in G has been visited and scored. In the paper's words: Specifically, the module broadcasts information on G by updating the score of outer nodes with their visited neighbours for multiple hops until all the nodes on G are visited.

Let ns(v) denote the score of node v; then

[Equation image: node score update]

Finally, apply softmax normalization over the scores of all nodes to obtain the concept distribution.
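The three steps above (initialize, propagate hop by hop, normalize) can be sketched as follows. The update rule itself is an assumption, because the formula image is missing: each newly reached node takes the maximum over its visited in-neighbors of a discounted neighbor score plus a triple relevance. The discount `gamma`, the max aggregator and the `relevance` callable are all hypothetical stand-ins.

```python
import math


def propagate_scores(edges, source_concepts, relevance, gamma=0.5, max_hops=10):
    """edges: list of (u, r, v) triples on the sub-graph G.

    relevance: hypothetical callable (u, r, v) -> float standing in for
    whatever context-dependent triple score the model computes.
    """
    sources = set(source_concepts)
    nodes = {n for u, _, v in edges for n in (u, v)} | sources
    # Step 1: source concepts start at 1, everything else at 0.
    score = {n: (1.0 if n in sources else 0.0) for n in nodes}
    visited = set(sources)
    # Step 2: visited nodes update their unvisited neighbours, hop by hop.
    for _ in range(max_hops):
        incoming = {}
        for u, r, v in edges:
            if u in visited and v not in visited:
                incoming.setdefault(v, []).append(
                    gamma * score[u] + relevance(u, r, v)
                )
        if not incoming:
            break
        for v, values in incoming.items():
            score[v] = max(values)  # assumed aggregator
        visited |= set(incoming)
    return score


def softmax(score):
    """Step 3: normalize node scores into a concept distribution."""
    top = max(score.values())
    exps = {n: math.exp(s - top) for n, s in score.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}


toy_edges = [("a", "r", "b"), ("b", "r", "c")]
toy_scores = propagate_scores(toy_edges, ["a"], lambda u, r, v: 0.0)
toy_dist = softmax(toy_scores)
```

On this toy chain with zero relevance, the scores decay by `gamma` per hop (1.0, 0.5, 0.25), so the resulting distribution ranks a above b above c.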

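Final generation distribution with gate control. The final output distribution is a gate-weighted combination of two distributions: the concept distribution from the knowledge graph, and the standard vocabulary distribution obtained by decoding the hidden state directly (based only on the input text). This combined distribution determines the next token.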
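A minimal sketch of such a gated mixture, under simplifying assumptions: the gate is passed in as a plain number in [0, 1] rather than predicted by the model, and concepts are assumed to map one-to-one onto vocabulary tokens. The function name and the toy probabilities are made up for illustration.

```python
def mix_distributions(p_vocab, p_concept, gate):
    """Return gate * p_concept + (1 - gate) * p_vocab over the union of tokens."""
    tokens = set(p_vocab) | set(p_concept)
    return {
        t: gate * p_concept.get(t, 0.0) + (1.0 - gate) * p_vocab.get(t, 0.0)
        for t in tokens
    }


p_vocab = {"the": 0.6, "rain": 0.4}          # from the decoder hidden state
p_concept = {"rain": 0.5, "umbrella": 0.5}   # from the node scores
mixed = mix_distributions(p_vocab, p_concept, gate=0.5)
next_token = max(mixed, key=mixed.get)
```

Because both inputs sum to 1 and the weights sum to 1, the mixture is again a valid distribution; here "rain" wins because both sources give it mass.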

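4. Background Knowledge

4.1 Graph

Let G = (V, E) be a graph. The features of graph data fall into two kinds:

1) Node features: information about the nodes themselves, i.e. the features of V
2) Structural features: the relational features between nodes (edge features), i.e. the features of E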

4.2 GCN(Graph Convolutional Network)

For an introduction, see https://zhuanlan.zhihu.com/p/37091549

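5. Experiments

5.1 Three Tasks

Story Ending Generation (SEG): generate the ending of a story
Abductive NLG (αNLG): generate an explanation that causally links the given observations
Explanation Generation (EG): generate an explanation for a counterfactual statement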

5.2 Extracting Sub-Graphs as Knowledge Grounding

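Concepts are extracted from the input to form a sub-graph. The main steps are:

Choose the knowledge base. The paper uses ConceptNet (Speer and Havasi, 2012) as the commonsense knowledge base.
Extract concepts from the input sequence by string matching. The paper performs fuzzy matching with the lemmatized form of the surface texts using spaCy and filters out stop words, in order to recognize ConceptNet concepts in the input text sequence.
Build the sub-graph (important). Starting from the nodes in the current sub-graph (initialized with the concepts from the input sequence), search the direct neighbors of each node via triples as candidate nodes, and keep enlarging the sub-graph by adding the candidates whose incoming degree meets the requirement. Intuitively, this construction keeps the most commonly used knowledge connections between concepts.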
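The sub-graph construction step can be sketched roughly as below. The notes do not give the exact selection rule, so the parameters `min_in_degree` and `max_new_per_hop` are hypothetical, as is the tie-breaking by name; the triples are made-up toy data, not real ConceptNet entries.

```python
def grow_subgraph(triples, source_concepts, hops=2, min_in_degree=1,
                  max_new_per_hop=100):
    """Expand a sub-graph from source concepts for a fixed number of hops.

    triples: list of (head, relation, tail) tuples from the knowledge base.
    A candidate's incoming degree counts the edges reaching it from nodes
    already in the sub-graph.
    """
    nodes = set(source_concepts)
    for _ in range(hops):
        in_degree = {}
        for head, _rel, tail in triples:
            if head in nodes and tail not in nodes:
                in_degree[tail] = in_degree.get(tail, 0) + 1
        # Keep candidates that are well connected to the current sub-graph.
        kept = [n for n, d in in_degree.items() if d >= min_in_degree]
        kept.sort(key=lambda n: (-in_degree[n], n))
        kept = kept[:max_new_per_hop]
        if not kept:
            break
        nodes.update(kept)
    sub_triples = [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]
    return nodes, sub_triples


toy_triples = [
    ("rain", "Causes", "wet"),
    ("cloud", "Causes", "rain"),
    ("rain", "RelatedTo", "umbrella"),
    ("storm", "RelatedTo", "umbrella"),
    ("wet", "RelatedTo", "towel"),
]
sub_nodes, sub_edges = grow_subgraph(
    toy_triples, {"rain", "storm"}, hops=1, min_in_degree=2
)
```

In this toy run only "umbrella" is reached from two sub-graph nodes, so it is the only concept added; "wet" has incoming degree 1 and is dropped.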

EMNLP paper reading notes 2021-04-12 (do not repost without permission)