文献学习02-Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction

论文信息

(1)题目:Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction(用于联合实体和关系提取的编码器-解码器架构的有效建模)

(2)文章下载地址:https://ojs.aaai.org//index.php/AAAI/article/view/6374 

(3)相关代码:https://github.com/nusnlp/PtrNetDecoding4JERE

(4)作者信息:Tapas Nayak, Hwee Tou Ng  

         Department of Computer Science, National University of Singapore

         [email protected], [email protected]

目录

Abstract

Introduction

Task Description

Encoder-Decoder Architecture

Embedding Layer & Encoder

Word-level Decoder & Copy Mechanism

Pointer Network-based Decoder

Relation Tuple Extraction

Attention Modeling 

Loss Function

Experiments

Datasets

Parameter Settings

Baseline and Evaluation Metrics

Experimental Results

Analysis and Discussion 

Ablation Studies

Performance Analysis 

Error Analysis

Related Work

Conclusion


Abstract

A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text. There may be multiple relation tuples present in a text and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task and sharing of entities or overlapping entities among the tuples makes it more challenging. Most prior work adopted a pipeline approach where entities were identified first followed by finding the relations among them, thus missing the interaction among the relation tuples in a sentence. In this paper, we propose two approaches to use encoder-decoder architecture for jointly extracting entities and relations. In the first approach, we propose a representation scheme for relation tuples which enables the decoder to generate one word at a time like machine translation models and still finds all the tuples present in a sentence with full entity names of different length and with overlapping entities. Next, we propose a pointer network-based decoding approach where an entire tuple is generated at every time step. Experiments on the publicly available New York Times corpus show that our proposed approaches outperform previous work and achieve significantly higher F1 scores.

关系元组由两个实体以及它们之间的关系组成,这样的元组经常出现在非结构化文本中。一段文本中可能存在多个关系元组,并且它们之间可能共享一个或两个实体。从一个句子中抽取这样的关系元组是一项困难的任务,而元组之间共享实体或实体重叠使其更具挑战性。以前的大多数工作都采用管道方法,首先识别实体,然后再找出它们之间的关系,因此忽略了句子中关系元组之间的交互。在本文中,我们提出了两种使用encoder-decoder体系结构联合抽取实体和关系的方法。

第一种方法,我们提出了一种关系元组的表示方案,使得解码器能够像机器翻译模型一样一次生成一个单词,并且仍然能找到一个句子中的所有元组,即使这些实体名长度不同或实体之间存在重叠。第二种方法,我们提出了一种基于指针网络的解码方法,在每一个时间步生成一个完整的元组。实验基于公开可获得的纽约时报语料库,结果表明本文提出的方法优于前人的工作,并获得了显著更高的F1分数。

Introduction

Distantly-supervised information extraction systems extract relation tuples with a set of pre-defined relations from text.


Traditionally, researchers (Mintz et al. 2009; Riedel, Yao, and McCallum 2010; Hoffmann et al. 2011) use pipeline approaches where a named entity recognition (NER) system is used to identify the entities in a sentence and then a classifier is used to find the relation (or no relation) between them. However, due to the complete separation of entity detection and relation classification, these models miss the interaction between multiple relation tuples present in a sentence.


Recently, several neural network-based models (Katiyar and Cardie 2016; Miwa and Bansal 2016) were proposed to jointly extract entities and relations from a sentence. These models used a parameter-sharing mechanism to extract the entities and relations in the same network. But they still find the relations after identifying all the entities and do not fully capture the interaction among multiple tuples.


Zheng et al. (2017) proposed a joint extraction model based on neural sequence tagging scheme. But their model could not extract tuples with overlapping entities in a sentence as it could not assign more than one tag to a word.


Zeng et al. (2018) proposed a neural encoder-decoder model for extracting relation tuples with overlapping entities. However, they used a copy mechanism to copy only the last token of the entities, thus this model could not extract the full entity names.

Also, their best performing model used a separate decoder to extract each tuple, which limited the power of their model. This model was trained with a fixed number of decoders and could not extract tuples beyond that number during inference. 此外,他们性能最好的模型使用单独的解码器来抽取每个元组,这限制了模型的能力。该模型使用固定数量的解码器进行训练,在推理过程中无法提取超过该数量的元组。

Encoder-Decoder models are powerful models and they are successful in many NLP tasks such as machine translation, sentence generation from structured data, and open information extraction. 编码器-解码器模型是一类功能强大的模型,在机器翻译、由结构化数据生成句子和开放信息抽取等NLP任务中都取得了成功。

In this paper, we explore how encoder-decoder models can be used effectively for extracting relation tuples from sentences. There are three major challenges in this task: (1) The model should be able to extract entities and relations together. (2) It should be able to extract multiple tuples with overlapping entities. (3) It should be able to extract exactly two entities of a tuple with their full names. 本任务有三大挑战:(i)模型能够同时抽取实体和关系;(ii)能够抽取具有重叠实体的多个元组;(iii)能够准确抽取出元组的两个实体及其完整名称。

To address these challenges, we propose two novel approaches using encoder-decoder architecture. We first propose a new representation scheme for relation tuples (Table 1) such that it can represent multiple tuples with overlapping entities and different lengths of entities in a simple way. We employ an encoder-decoder model where the decoder extracts one word at a time like machine translation models. At the end of sequence generation, due to the unique representation of the tuples, we can extract the tuples from the sequence of words.

为了解决这些挑战,提出了两种使用编码器-解码器体系结构的新颖方法。首先提出一种用于关系元组的新表示方案(表1),以便它可以用简单的方式表示具有重叠实体和不同长度实体的多个元组。采用编码器-解码器模型,其中解码器像机器翻译模型一样一次提取一个单词。在序列生成的最后,由于元组的独特表示,可以从单词序列中提取元组。

Although this model performs quite well, generating one word at a time is somewhat unnatural for this task. Each tuple has exactly two entities and one relation, and each entity appears as a continuous text span in a sentence.

尽管此模型执行得很好,但是一次生成一个单词对于此任务来说有点不自然。每个元组恰好具有两个实体和一个关系,并且每个实体在句子中显示为连续的文本范围。

The most effective way to identify them is to find their start and end location in the sentence. Each relation tuple can then be represented using five items: start and end location of the two entities and the relation between them (see Table 1).

Keeping this in mind, we propose a pointer network-based decoding framework. 基于这一点,我们提出了一种基于指针网络的解码框架。This decoder consists of two pointer networks which find the start and end location of the two entities in a sentence, and a classification network which identifies the relation between them. 这个解码器由两个指针网络和一个分类网络组成:两个指针网络从句子中找出两个实体的起始和结束位置,分类网络则识别两个实体之间的关系。At every time step of the decoding, this decoder extracts an entire relation tuple, not just a word. Experiments on the New York Times (NYT) datasets show that our approaches work effectively for this task and achieve state-of-the-art performance. 在解码的每一步,该解码器都提取一个完整的关系元组,而不仅仅是一个单词。在纽约时报(NYT)数据集上的实验表明,我们的方法有效地完成了这项任务,并取得了最先进的性能。To summarize, the contributions of this paper are as follows: (1) We propose a new representation scheme for relation tuples such that an encoder-decoder model, which extracts one word at each time step, can still find multiple tuples with overlapping entities and tuples with multi-token entities from sentences. We also propose a masking-based copy mechanism to extract the entities from the source sentence only. 综上所述,本文的贡献如下:(1)我们提出了一种新的关系元组表示方案,使得每个时间步提取一个单词的编码器-解码器模型仍然可以从句子中找到实体重叠的多个元组以及包含多词实体的元组。我们还提出了一种基于屏蔽的复制机制,确保只从源句子中复制实体。

(2)We propose a modification in the decoding framework with pointer networks to make the encoder-decoder model more suitable for this task. At every time step, this decoder extracts an entire relation tuple, not just a word. This new decoding framework helps in speeding up the training process and uses less resources (GPU memory). This will be an important factor when we move from sentence-level tuple extraction to document-level extraction. 我们在解码框架中引入指针网络进行改进,使编码器-解码器模型更适合这项任务。每个时间步,该解码器提取一个完整的关系元组,而不仅仅是一个单词。这种新的解码框架有助于加快训练过程,并使用更少的资源(GPU内存)。当我们从句子级元组提取转向文档级提取时,这将是一个重要因素。

(3)Experiments on the NYT datasets show that our approaches outperform all the previous state-of-the-art models significantly and set a new benchmark on these datasets.在NYT数据集上的实验表明,我们的方法显著优于所有以前的最新模型,并在这些数据集上建立了一个新的基准。

Task Description

A relation tuple consists of two entities and a relation. Such tuples can be found in sentences where an entity is a text span in a sentence and a relation comes from a pre-defined set R. These tuples may share one or both entities among them. Based on this, we divide the sentences into three classes: (1) No Entity Overlap (NEO): A sentence in this class has one or more tuples, but they do not share any entities. (2) Entity Pair Overlap (EPO): A sentence in this class has more than one tuple, and at least two tuples share both the entities in the same or reverse order. (3) Single Entity Overlap (SEO): A sentence in this class has more than one tuple and at least two tuples share exactly one entity. It should be noted that a sentence can belong to both EPO and SEO classes. Our task is to extract all relation tuples present in a sentence.

关系元组由两个实体和一个关系组成。这样的元组可以在句子中找到,其中实体是句子中的一个文本片段,关系来自预定义的集合R。这些元组之间可能共享一个或两个实体。基于此,我们将句子分为三类:

(1)无实体重叠(NEO):此类中一个句子包含一个或多个元组,但它们不共享任何实体。
(2)实体对重叠(EPO):此类中一个句子有多个元组,并且至少两个元组以相同或相反的顺序共享两个实体。
(3)单实体重叠(SEO):此类中一个句子包含一个以上的元组,并且至少两个元组正好共享一个实体。
需要注意的是,一个句子可以同时属于EPO和SEO两类。我们的任务是提取句子中存在的所有关系元组。(下面给出一个按上述定义判断句子所属类别的示意代码。)
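
下面是一个按上述定义判断句子所属重叠类别的示意Python片段(元组以(实体1, 关系, 实体2)三元组表示,示例中的关系名仅为说明用,并非论文代码):

```python
from itertools import combinations

def overlap_classes(tuples):
    """tuples: [(实体1, 关系, 实体2), ...],返回句子所属的重叠类别集合。"""
    classes = set()
    shared = False
    for (e1a, _, e2a), (e1b, _, e2b) in combinations(tuples, 2):
        common = {e1a, e2a} & {e1b, e2b}
        if len(common) == 2:      # 两个实体都相同(顺序相同或相反) -> EPO
            classes.add("EPO")
            shared = True
        elif len(common) == 1:    # 恰好共享一个实体 -> SEO
            classes.add("SEO")
            shared = True
    if tuples and not shared:     # 所有元组互不共享实体 -> NEO
        classes.add("NEO")
    return classes

# 两个元组共享实体 "New York",该句子属于 SEO
print(overlap_classes([("New York", "/location/contains", "Brooklyn"),
                       ("New York", "/location/contains", "Queens")]))
```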

Encoder-Decoder Architecture

In this task, input to the system is a sequence of words, and output is a set of relation tuples. In our first approach, we represent each tuple as entity1 ; entity2 ; relation. We use ';' as a separator token to separate the tuple components. Multiple tuples are separated using the '|' token. We have included one example of such representation in Table 1. Multiple relation tuples with overlapping entities and different lengths of entities can be represented in a simple way using these special tokens (; and |). 此任务中,输入是单词序列,输出是一组关系元组。在第一种方法中,我们将每个元组表示为"实体1 ; 实体2 ; 关系",使用';'作为分隔符来分隔元组的各个组成部分,多个元组之间使用'|'分隔。使用这些特殊标记,可以用一种简单的方式表示具有重叠实体和不同长度实体的多个关系元组。

During inference, after the end of sequence generation, relation tuples can be extracted easily using these special tokens. Due to this uniform representation scheme, where entity tokens, relation tokens, and special tokens are treated similarly, we use a shared vocabulary between the encoder and decoder which includes all of these tokens. The input sentence contains clue words for every relation which can help generate the relation tokens. We use two special tokens so that the model can distinguish between the beginning of a relation tuple and the beginning of a tuple component. To extract the relation tuples from a sentence using the encoder-decoder model, the model has to generate the entity tokens, find relation clue words and map them to the relation tokens, and generate the special tokens at appropriate time. Our experiments show that the encoder-decoder models can achieve this quite effectively. 在推理过程中,序列生成结束后,可以使用这些特殊标记轻松提取关系元组。由于采用这种统一的表示方案,实体标记、关系标记和特殊标记被同等对待,因此编码器和解码器之间使用包含所有这些标记的共享词汇表。输入句子中包含每个关系的线索词,可以帮助生成关系标记。我们使用两个特殊标记,以便模型可以区分关系元组的开头和元组组成部分的开头。为了使用编码器-解码器模型从句子中提取关系元组,该模型必须生成实体标记,找到关系线索词并将其映射到关系标记,并在适当的时间步生成特殊标记。我们的实验表明,编码器-解码器模型可以非常有效地做到这一点。
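
为帮助理解该表示方案,下面给出把元组序列化为目标序列、再从生成的词序列解析回元组(含去重与过滤)的简化示例(示意代码,并非论文官方实现):

```python
def tuples_to_target(tuples):
    """把 [(实体1, 实体2, 关系), ...] 编码成 WordDecoding 的目标序列。"""
    return " | ".join(f"{e1} ; {e2} ; {rel}" for e1, e2, rel in tuples)

def target_to_tuples(seq, relation_set):
    """从生成的词序列中解析元组:按 '|' 切分出元组,再按 ';' 切分出组成部分,
    并去掉重复、两个实体相同、或关系不在关系集合中的元组。"""
    results = []
    for chunk in seq.split("|"):
        parts = [p.strip() for p in chunk.split(";")]
        if len(parts) != 3:
            continue
        e1, e2, rel = parts
        if e1 and e2 and e1 != e2 and rel in relation_set and (e1, e2, rel) not in results:
            results.append((e1, e2, rel))
    return results

target = tuples_to_target([("New York", "Brooklyn", "/location/contains")])
print(target)                                            # New York ; Brooklyn ; /location/contains
print(target_to_tuples(target, {"/location/contains"}))  # [('New York', 'Brooklyn', '/location/contains')]
```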

Embedding Layer & Encoder

（截图：原文公式，嵌入层与编码器部分）

(1)关系集合R;

(2)特殊分割符标记:“;”,“|”

(3)start-of-target-sequence (SOS)、end-of-target-sequence (EOS)、unknown word token (UNK)

(4)词级嵌入由两部分组成:①预训练词向量;②基于字符嵌入的特征向量

(5)词嵌入层E_{w} \in \mathbb{R}^{|V| \times d_{w}},字符级嵌入E_{c} \in \mathbb{R}^{|A| \times d_{c}};

 其中,|V|是词表大小,d_{w}是词向量的维数;A是输入句子标记所含字符的字符表,d_{c}是字符嵌入向量的维数;

(6)使用具有最大池化的卷积神经网络为每个单词提取维度为d_{f}的特征向量。

(7)Word embeddings and character embedding-based feature vectors are concatenated(||) to obtain the representation of the input tokens.  (此处的“||”理解为连接,有两个向量连接成为一个向量)

(8)S 被表示成:X_{1},X_{2},\ldots,X_{n},其中X_{i}\in \mathbb{R}^{(d_{w}+d_{f})}表示的是第i个词,n表示S的长度。

(9)X_{i}被传递到Bi-LSTM中,获取隐藏层的表示h_{i}^{E}。Bi-LSTM的前向和后向LSTM的隐藏维度设置为\frac{d_{h}}{2},因此h_{i}^{E} \in \mathbb{R}^{d_{h}};

(10)d_{h}是后面描述的解码器的序列生成器LSTM的隐藏维度。
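
结合上面(1)–(10)的描述,下面给出嵌入层与编码器的一个极简PyTorch草图:字符级嵌入经CNN+最大池化得到d_{f}维特征,与词向量拼接后送入Bi-LSTM。代码仅为帮助理解的示意实现,各维度取值为假设值,并非论文官方代码:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, char_size, dw=300, dc=50, df=50, dh=300):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dw)      # E_w ∈ R^{|V|×d_w}
        self.char_emb = nn.Embedding(char_size, dc)       # E_c ∈ R^{|A|×d_c}
        self.char_cnn = nn.Conv1d(dc, df, kernel_size=3, padding=1)
        # 前向/后向 LSTM 的隐藏维度各为 d_h/2,拼接后 h_i^E ∈ R^{d_h}
        self.bilstm = nn.LSTM(dw + df, dh // 2, batch_first=True, bidirectional=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (B, n); char_ids: (B, n, L),L 为每个词的最大字符数
        B, n, L = char_ids.size()
        w = self.word_emb(word_ids)                                    # (B, n, d_w)
        c = self.char_emb(char_ids.reshape(B * n, L)).transpose(1, 2)  # (B*n, d_c, L)
        c = torch.max(self.char_cnn(c), dim=2).values.reshape(B, n, -1)  # 最大池化 -> (B, n, d_f)
        x = torch.cat([w, c], dim=-1)              # x_i ∈ R^{d_w+d_f}
        h, _ = self.bilstm(x)                      # h_i^E ∈ R^{d_h}
        return h

enc = Encoder(vocab_size=10000, char_size=100)
h = enc(torch.randint(0, 10000, (2, 12)), torch.randint(0, 100, (2, 12, 8)))
print(h.shape)   # torch.Size([2, 12, 300])
```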

Word-level Decoder & Copy Mechanism


（截图：原文公式，词级解码器）

(1)目标序列T仅由标记y_{0}, y_{1}, \ldots, y_{m}的词嵌入向量表示,其中y_{i}是第i个标记的嵌入向量,m是目标序列的长度。

(2)y0和ym分别代表SOS和EOS标记的嵌入向量。解码器一次生成一个标记,并在生成EOS时停止。

(3)使用LSTM作为解码器。在时间步t,解码器将源语句编码向量(e_{t})和先前的目标词嵌入(y_{t-1})作为输入,并生成当前标记的隐藏表示(h_{t}^{D})。

(4)句子编码向量e_{t}使用注意力机制获得;再使用带权重矩阵和偏置向量的线性层将h_{t}^{D}投影到词汇表V上,得到得分向量\hat{O}_{t}。

 (5)\mathbf{O}_{t}表示词汇表中所有单词在时间步t上的归一化得分;h_{t-1}^{D}表示解码器LSTM前一时间步的隐藏状态。

 The projection layer of the decoder maps the decoder output to the entire vocabulary. During training, we use the gold label target tokens directly. However, during inference, the decoder may predict a token from the vocabulary which is not present in the current sentence or the set of relations or the special tokens. 解码器的投影层将解码器的输出映射到整个词汇表上。在训练过程中,我们直接使用标注的(gold)目标标记。然而在推理过程中,解码器可能从词汇表中预测出不在当前句子、关系集合或特殊标记中的标记。

To prevent this, we use a masking technique (屏蔽技术) while applying the softmax operation at the projection layer. We mask (exclude) all words of the vocabulary except the current source sentence tokens, relation tokens, separator tokens (';', '|'), UNK, and EOS tokens in the softmax operation. To mask (exclude) some word from softmax, we set the corresponding value in \hat{O}_{t} at -\infty and the corresponding softmax score will be zero.(把被屏蔽单词的得分设为负无穷,相应的softmax分数将为0)This ensures the copying of entities from the source sentence only.(确保只从源句子中复制实体)

We include the UNK token in the softmax operation to make sure that the model generates new entities during inference. If the decoder predicts an UNK token, we replace it with the corresponding source word which has the highest attention score.(如果解码器预测的是UNK标记,则将其替换为注意力得分最高的源词)。
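
下面给出基于屏蔽的复制机制的一个示意实现:在投影层做softmax之前,把除「当前源句词、关系标记、分隔符(';'、'|')、UNK、EOS」之外的词的得分置为负无穷,使其softmax概率为0。代码为草图,词表与张量的组织方式为假设,并非论文实现:

```python
import torch
import torch.nn.functional as F

def masked_softmax(logits, allowed_ids):
    """logits: (B, |V|) 投影层得分;allowed_ids: 每个样本允许生成的词 id 列表。"""
    mask = torch.full_like(logits, float("-inf"))
    for b, ids in enumerate(allowed_ids):
        mask[b, ids] = 0.0                        # 允许的词保持原得分,其余为 -inf
    return F.softmax(logits + mask, dim=-1)       # 被屏蔽词的概率为 0

# 用法示意:allowed_ids = 源句词 id ∪ 关系标记 id ∪ {';', '|', UNK, EOS} 的 id
logits = torch.randn(2, 10)
probs = masked_softmax(logits, [[0, 3, 5, 9], [1, 2, 9]])
print(probs.sum(dim=-1))   # 每行和为 1,未允许位置的概率为 0
```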

During inference, after decoding is finished, we extract all tuples based on the special tokens, remove duplicate tuples and tuples in which both entities are the same or tuples where the relation token is not from the relation set. This model is referred to as WordDecoding (WDec) henceforth.(在推理过程中,解码完成后,我们基于特殊标记提取所有元组,并去除重复的元组、两个实体相同的元组,以及关系标记不在关系集合中的元组。此后该模型被称为WordDecoding(WDec)。)

Pointer Network-based Decoder

 (1)在第二种方法中,使用实体在句子中的开始和结束位置来识别实体。

(2)从单词词汇表中删除特殊标记和关系名称,词嵌入仅在编码器侧与字符嵌入一起使用。在模型的解码器端使用附加的关系嵌入矩阵E_{r}。

(3)y0是一个虚拟元组,表示序列的起始元组。而ym充当序列的结束元组,该序列具有EOS作为关系(该元组将忽略实体)。

（截图：原文公式，指针网络解码器）

 (1)解码器由一个LSTM(具有隐藏维dh来生成元组序列)、两个指针网络(用于查找两个实体)以及分类网络(用于查找元组的关系)组成。

(2)在时间步t,解码器将源语句编码向量e_{t}和所有先前生成元组的表示y_{prev}作为输入,并生成当前元组的隐藏表示h_{t}^{D}。

 (3)句子编码向量et是通过使用Attention mechanism获得的;

(4)关系元组是一个集合。为了防止解码器再次生成相同的元组,在解码的每个时间步都把所有先前已生成元组的信息传递进来。

(5)y_{j}表示时间步j生成的元组表示;h_{t-1}^{D}是解码器LSTM在时间步t-1的隐藏状态。
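
结合(1)–(5),下面用一段可运行的PyTorch片段勾勒指针网络解码器的外层循环:每个时间步把注意力得到的句子编码e_{t}与先前所有已生成元组表示之和y_{prev}拼接后送入LSTMCell。其中注意力与元组表示均用占位实现代替,维度为假设值,拼接与累加方式也是一种简化;真实的逐步计算见下一节的草图:

```python
import torch
import torch.nn as nn

dh, dr, n = 300, 300, 12                  # 隐藏维度、元组表示维度、句长(均为假设值)
cell = nn.LSTMCell(dh + dr, dh)           # 每步输入为 [e_t ; y_prev] 的拼接
enc = torch.randn(1, n, dh)               # 编码器输出 h^E(此处用随机数代替)
y_prev = torch.zeros(1, dr)               # 先前元组表示的累加,初始对应虚拟元组 y_0
hx = cx = torch.zeros(1, dh)

for t in range(3):                        # 演示 3 个解码时间步
    attn = torch.softmax(enc.mean(dim=-1), dim=-1)        # 占位注意力,真实实现见注意力小节
    e_t = torch.bmm(attn.unsqueeze(1), enc).squeeze(1)    # 句子编码向量 e_t
    hx, cx = cell(torch.cat([e_t, y_prev], dim=-1), (hx, cx))
    y_t = torch.randn(1, dr)              # 占位:真实实现中由指针网络与关系嵌入得到
    y_prev = y_prev + y_t                 # 把新生成元组的表示并入 y_prev
```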

Relation Tuple Extraction

（截图：原文公式，指针网络计算）

在获得当前元组的隐藏表示h_{t}^{D}之后,首先在源语句中找到两个实体的开始和结束指针。将向量h_{t}^{D}与编码器的每个隐藏向量连接起来,传递给一个隐藏维度为d_{p}的Bi-LSTM层(包含前向和后向LSTM)。该Bi-LSTM层的隐藏向量再被传递到两个带softmax的前馈网络(FFN),把每个隐藏向量转换为两个介于0和1之间的标量;softmax操作作用于输入句子的所有单词之上。这两个标量值分别表示相应源句子标记是第一个实体开始位置和结束位置的概率。这个带有两个前馈层的Bi-LSTM层,就是识别当前关系元组第一个实体的第一个指针网络。

（截图：原文公式，实体起止位置概率）

（截图：原文公式，实体起止位置概率（续））

  上式得到的归一化概率,表示第i个源词是预测元组第一个实体的开始位置和结束位置的概率。然后使用第二个指针网络来提取第二个实体:将这三个向量连接起来,传入第二个指针网络,得到第i个源词是第二个实体开始和结束位置的概率。再利用归一化后的概率计算两个实体的向量表示,如下:

（截图：原文公式，实体向量计算）

 将上面得到的两个实体向量与h_{t}^{D}三者连接起来,乘以权重矩阵,并加上偏置向量b_{r}:

（截图：原文公式，关系分类）

(1)r_{t}表示时间步t时预测关系的归一化概率;预测关系的嵌入向量由argmax函数、r_{t}和E_{r}计算得到;y_{t}是时间步t预测元组的向量表示。

(2)在训练过程中,我们用真实(gold)关系的嵌入向量代替预测关系的嵌入向量,因此argmax函数不会影响训练期间的反向传播。

 (3)当预测关系为EOS时,解码器停止序列生成过程。以上就是解码器的分类网络。
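
结合上面的文字描述,下面给出单个解码时间步内「两个指针网络 + 关系分类网络」的一个简化PyTorch草图:把h_{t}^{D}与每个编码器隐藏向量拼接送入第一个Bi-LSTM指针网络,经两个带softmax的前馈层得到实体1的起止概率;再把该指针网络的输出一并送入第二个指针网络得到实体2的起止概率;最后把两个实体向量与h_{t}^{D}拼接做关系分类。其中实体向量的加权方式、各维度取值等均为示意性假设,并非论文公式的逐项复现:

```python
import torch
import torch.nn as nn

class TupleDecoderStep(nn.Module):
    def __init__(self, dh=300, dp=300, num_rel=25):
        super().__init__()
        # 第一个指针网络:输入为 [h_i^E ; h_t^D]
        self.ptr1 = nn.LSTM(2 * dh, dp // 2, batch_first=True, bidirectional=True)
        # 第二个指针网络:额外拼接第一个指针网络的输出
        self.ptr2 = nn.LSTM(2 * dh + dp, dp // 2, batch_first=True, bidirectional=True)
        self.s1, self.e1 = nn.Linear(dp, 1), nn.Linear(dp, 1)   # 实体1 起/止 FFN
        self.s2, self.e2 = nn.Linear(dp, 1), nn.Linear(dp, 1)   # 实体2 起/止 FFN
        self.rel = nn.Linear(2 * dp + dh, num_rel)              # 关系分类(含 EOS 关系)

    def forward(self, h_t, enc):                 # h_t: (B, d_h), enc: (B, n, d_h)
        B, n, _ = enc.size()
        x1 = torch.cat([enc, h_t.unsqueeze(1).expand(-1, n, -1)], dim=-1)
        u1, _ = self.ptr1(x1)
        s1 = torch.softmax(self.s1(u1).squeeze(-1), dim=-1)      # 实体1 开始位置概率
        e1 = torch.softmax(self.e1(u1).squeeze(-1), dim=-1)      # 实体1 结束位置概率
        u2, _ = self.ptr2(torch.cat([x1, u1], dim=-1))
        s2 = torch.softmax(self.s2(u2).squeeze(-1), dim=-1)
        e2 = torch.softmax(self.e2(u2).squeeze(-1), dim=-1)
        # 用起止概率对指针网络的隐藏向量加权,作为两个实体的向量表示(加权方式为简化示意)
        ent1 = torch.bmm(((s1 + e1) / 2).unsqueeze(1), u1).squeeze(1)
        ent2 = torch.bmm(((s2 + e2) / 2).unsqueeze(1), u2).squeeze(1)
        rel_probs = torch.softmax(self.rel(torch.cat([ent1, ent2, h_t], dim=-1)), dim=-1)
        return s1, e1, s2, e2, rel_probs

step = TupleDecoderStep()
s1, e1, s2, e2, r = step(torch.randn(2, 300), torch.randn(2, 12, 300))
print(s1.shape, r.shape)   # torch.Size([2, 12]) torch.Size([2, 25])
```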

（截图：原文公式，推理阶段的span选择）

(1)在推理过程中,选择两个实体的开始和结束的位置,求四个指针概率乘积最大值,同时保持两个实体不相互重叠,且1<=b<=e<=n的约束条件,其中b和e是对应实体的开始和结束的位置。

(2)首先选择基于对应的开始和结束的指针概率的乘积求实体1的起点和终点。然后类似的方法找到实体2,排除实体1的span,避免重叠。

(3)重复相同的过程,但先找实体2,再找实体1。

(4)在这两种选择中,选取四个指针概率乘积较高的那一对实体。此后,这个模型被称为PtrNetDecoding(PNDec)。(通俗地说:选项1是先找实体1、再找实体2,求四个概率的乘积;选项2是先找实体2、再找实体1,求四个概率的乘积;最后取乘积较高的选项作为输出。)
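
下面给出推理阶段span选择的一个直接枚举的示意实现(复杂度较高,仅用于说明"乘积最大 + 实体不重叠 + 1<=b<=e<=n"这几个约束,并非论文的实际实现):

```python
import itertools

def best_span(prob_s, prob_e, n, forbidden=None):
    """在 1<=b<=e<=n 且不与 forbidden 区间重叠的约束下,
    选出 start/end 指针概率乘积最大的区间 (b, e)。位置从 1 开始计数。"""
    best, best_p = None, -1.0
    for b, e in itertools.combinations_with_replacement(range(1, n + 1), 2):
        if forbidden and not (e < forbidden[0] or b > forbidden[1]):
            continue                                  # 跳过与另一个实体重叠的区间
        p = prob_s[b - 1] * prob_e[e - 1]
        if p > best_p:
            best, best_p = (b, e), p
    return best, best_p

def select_entities(s1, e1, s2, e2, n):
    # 选项1:先定实体1,再在不重叠的位置上定实体2;选项2:顺序相反
    a1, p_a1 = best_span(s1, e1, n)
    a2, p_a2 = best_span(s2, e2, n, forbidden=a1)
    b2, p_b2 = best_span(s2, e2, n)
    b1, p_b1 = best_span(s1, e1, n, forbidden=b2)
    # 取四个指针概率乘积较高的那一组 (实体1区间, 实体2区间)
    return (a1, a2) if p_a1 * p_a2 >= p_b1 * p_b2 else (b1, b2)
```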

Attention Modeling 

 We experimented with three different attention mechanisms for our word-level decoding model to obtain the source context vector e_{t}:

(1)Avg:上下文向量通过对编码器隐藏向量取平均获得: e_{t} = \frac{1}{n}\sum_{i=1}^{n} h_{i}^{E}

(2)N-gram:上下文向量是通过N-gram注意机制来获得的,其中N=3

（截图：原文公式，N-gram注意力）

 (3)Single:上下文向量的获得是通过Bahdanau,Cho and Bengio(2015) 提出的注意机制。这种注意机制给出了词级解码模型最优的性能。

（截图：原文公式，Single注意力）
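
下面给出"Single"注意力(Bahdanau, Cho, and Bengio 2015)的一个简化PyTorch实现:用解码器上一步的隐藏状态作为query,对编码器隐藏向量加权求和得到e_{t}(维度为假设值,仅作示意):

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    def __init__(self, dh=300):
        super().__init__()
        self.W_q = nn.Linear(dh, dh, bias=False)   # 变换 query(解码器上一步隐藏状态)
        self.W_k = nn.Linear(dh, dh, bias=False)   # 变换 key(编码器隐藏向量)
        self.v = nn.Linear(dh, 1, bias=False)

    def forward(self, enc, h_dec_prev):
        # enc: (B, n, d_h), h_dec_prev: (B, d_h)
        score = self.v(torch.tanh(self.W_k(enc) + self.W_q(h_dec_prev).unsqueeze(1)))
        alpha = torch.softmax(score.squeeze(-1), dim=-1)          # (B, n) 注意力权重
        e_t = torch.bmm(alpha.unsqueeze(1), enc).squeeze(1)       # (B, d_h) 上下文向量
        return e_t, alpha

# Avg 变体则相当于对每个位置取权重 1/n: e_t = enc.mean(dim=1)
attn = BahdanauAttention()
e_t, alpha = attn(torch.randn(2, 12, 300), torch.randn(2, 300))
print(e_t.shape, alpha.shape)   # torch.Size([2, 300]) torch.Size([2, 12])
```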

 (*)本文的注意模型(Attention Modeling)工作:

（截图：原文公式，注意力变体）

 对于基于指针网络的解码模型,使用single注意力模型的三个变体。首先,使用h_{t-1}^{D}在注意力机制中计算q_{t}^{i};其次,使用y_{prev}计算q_{t}^{i};最后,将由h_{t-1}^{D}和y_{prev}分别得到的两个注意力上下文向量进行级联,作为最终的注意力上下文向量。第三种变体为基于指针网络的解码模型提供了最佳性能。

（截图：原文公式，注意力变体（续））

Loss Function

 We minimize the negative log-likelihood loss of the generated words for word-level decoding (Lword) and minimize the sum of negative log-likelihood loss of relation classification and the four pointer locations for pointer network-based decoding (Lptr).

(1)v_{t}^{b} 是单词级解码中时间步t处目标单词的softmax得分;

(2)r、s、e分别对应真实关系标签、真实start指针位置和真实end指针位置的softmax得分;

(3)b和t分别表示第b个训练实例和解码时间步t;指针部分的损失对一个元组两个实体的start/end指针位置求和;

(4)B/T分别是batch size和解码器的最大时间步。
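
把上面的记号代入,两种损失大致可以写成如下形式(依据论文的文字描述整理,指针部分对一个元组两个实体共四个指针位置求和;具体记法以原文公式为准):

\mathcal{L}_{word} = -\frac{1}{B \times T}\sum_{b=1}^{B}\sum_{t=1}^{T}\log\big(v_{t}^{b}\big)

\mathcal{L}_{ptr} = -\frac{1}{B \times T}\sum_{b=1}^{B}\sum_{t=1}^{T}\Big[\log\big(r_{t}^{b}\big)+\sum_{k=1}^{2}\big(\log(s_{t}^{b,k})+\log(e_{t}^{b,k})\big)\Big]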

Experiments

Datasets

对于实验数据集,本文选取的是NYT语料库。该数据集有多个版本,本文使用了其中两个版本,分别如下:

(1)Zeng et al. (2018) 使用的版本,命名为NYT24,包含24个关系;

(2)Takanobu et al. (2019) 使用的版本,命名为NYT29,包含29个关系。

本文选择10%的原始训练数据作为验证数据集,剩余90%用于训练,表2给出了训练和测试的数据集的统计数据。

（截图：原文 Table 2，训练与测试数据集统计）

Parameter Settings

（截图：原文参数设置说明）

 相关参数说明见上面的截图。为了防止过拟合,将dropout rate设置为0.3。

Baseline and Evaluation Metrics

本文比较了如下最新的实体和关系联合抽取的模型:

(1)SPTree;参考文献:(Miwa and Bansal 2016):This is an end-to-end neural entity and relation extraction model using sequence LSTM and Tree LSTM. Sequence LSTM is used to identify all the entities first and then Tree LSTM is used to find the relation between all pairs of entities.

这是一个使用序列LSTM和树LSTM的端到端神经实体和关系提取模型。首先使用序列LSTM识别所有实体,然后使用树LSTM查找所有实体对之间的关系。

(2)Tagging;参考文献:(Zheng et al. 2017):This is a neural sequence tagging model which jointly extracts the entities and relations using an LSTM encoder and an LSTM decoder. They used a Cartesian product (笛卡尔积) of entity tags and relation tags to encode the entity and relation information together. This model does not work when tuples have overlapping entities.

这是一个神经序列标记模型,使用LSTM编码器和LSTM解码器联合提取实体和关系。他们使用实体标签和关系标签的笛卡尔积,将实体和关系信息编码在一起。当元组具有重叠实体时,此模型不起作用。

(3)CopyR;参考文献:(Zeng et al. 2018):This model uses an encoder-decoder approach for joint extraction of entities and relations. It copies only the last token of an entity from the source sentence. Their best performing multi-decoder model is trained with a fixed number of decoders where each decoder extracts one tuple.

该模型使用编码器-解码器方法联合提取实体和关系。它只从源语句复制实体的最后一个标记。最佳性能多解码器模型使用固定数量的解码器进行训练,其中每个解码器提取一个元组。

(4)HRL;参考文献:(Takanobu et al. 2019):This model uses a reinforcement learning (RL) algorithm with two levels of hierarchy for tuple extraction. A high-level RL finds the relation and a low-level RL identifies the two entities using a sequence tagging approach. This sequence tagging approach cannot always ensure extraction of exactly two entities.

该模型采用两层结构的强化学习(RL)算法进行元组提取。高级RL查找关系,低级RL使用序列标记方法标识两个实体。这种序列标记方法不能始终确保只提取两个实体。

(5)GraphR;参考文献:(Fu, Li, and Ma 2019):This model considers each token in a sentence as a node in a graph, and edges connecting the nodes as relations between them. They use graph convolution network (GCN) to predict the relations of every edge and then filter out some of the relations.

该模型将句子中的每个标记视为图中的一个节点,将连接节点的边视为它们之间的关系。使用图卷积网络(GCN)预测每条边的关系,然后过滤掉一些关系。

(6)N-gram Attention;参考文献:(Trisedya et al. 2019):This model uses an encoder-decoder approach with N-gram attention mechanism for knowledge-base completion using distantly supervised data. The encoder uses the entire Wikidata entity IDs and relation IDs as its vocabulary. The encoder takes the source sentence as input and the decoder outputs the two entity IDs and relation ID for every tuple. 该模型采用了一种具有N-gram注意力机制的编码器-解码器方法,利用远程监督数据进行知识库补全。编码器使用整个Wikidata的实体ID和关系ID作为其词汇表。编码器将源语句作为输入,解码器输出每个元组的两个实体ID和关系ID。

During training, it uses the mapping of entity names and wikidata IDs of the entire wikidata for proper alignment. Our task of extracting relation tuples with the raw entity names from a sentence is more challenging since entity names are not of fixed length.在训练期间,它使用整个wikidata的实体名称和wikidata ID的映射进行适当的对齐。我们从句子中提取具有原始实体名称的关系元组的任务更具挑战性,因为实体名称不是固定长度的。

Our more generic approach is also helpful for extracting new entities which are not present in the existing knowledge bases such as Wikidata. We use their N-gram attention mechanism in our model to compare its performance with other attention models (Table 4).我们更通用的方法也有助于提取现有知识库(如Wikidata)中不存在的新实体。在我们的模型中,使用他们的N-gram注意机制来比较其与其他注意模型的性能(表4)。

We use the same evaluation method used by Takanobu et al. (2019) in their experiments. We consider the extracted tuples as a set and remove the duplicate tuples. An extracted tuple is considered as correct if the corresponding full entity names are correct and the relation is also correct. We report precision, recall, and F1 score for comparison.
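
上述评价方式可以概括为:把每个句子的抽取结果视为集合并去重,只有当两个实体的完整名称和关系都与标注一致时才算正确。下面是一个示意性的精确率/召回率/F1计算片段(非官方评测脚本):

```python
def prf1(gold_tuples, pred_tuples):
    """gold_tuples / pred_tuples: 每个句子一个列表,元素为 (实体1全名, 实体2全名, 关系)。"""
    gold_cnt = pred_cnt = correct = 0
    for gold, pred in zip(gold_tuples, pred_tuples):
        gold, pred = set(gold), set(pred)        # 视为集合,自动去除重复元组
        gold_cnt += len(gold)
        pred_cnt += len(pred)
        correct += len(gold & pred)              # 实体全名与关系完全匹配才计入
    p = correct / pred_cnt if pred_cnt else 0.0
    r = correct / gold_cnt if gold_cnt else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

print(prf1([[("New York", "Brooklyn", "/location/contains")]],
           [[("New York", "Brooklyn", "/location/contains")]]))   # (1.0, 1.0, 1.0)
```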

Experimental Results

（截图：原文实验结果表）

（截图：原文实验结果表（续））

Analysis and Discussion 

Ablation Studies

We include the performance of different attention mechanisms with our WordDecoding model, effects of our masking-based copy mechanism, and ablation results of three variants of the single attention mechanism with our PtrNetDecoding model in Table4.

我们在表4中给出了不同注意力机制在WordDecoding模型下的表现、基于屏蔽的复制机制的影响,以及single注意力机制三种变体在PtrNetDecoding模型下的消融结果。

WordDecoding with single attention achieves the highest F1 score on both datasets. We also see that our copy mechanism improves F1 scores by around 4-7% in each attention mechanism with both datasets.

PtrNetDecoding achieves the highest F1 scores when we combine the two attention mechanisms with respect to the previous hidden vector of the decoder LSTM (h_{t-1}^{D}) and representation of all previously extracted tuples (y_{prev}).

在两个数据集上,单注意力文字解码的F1得分最高。我们还发现,我们的复制机制在两个数据集的每个注意机制中提高F1分数约4-7%。

当我们结合两种注意机制时,PtrNetDecoding可以获得最高的F1分数,这两种注意机制涉及解码器LSTM的先前隐藏向量和所有先前提取的元组的表示。

（截图：原文 Table 4，消融实验结果）

Performance Analysis 

From Table 3, we see that CopyR, HRL, and our models achieve significantly higher F1 scores on the NYT24 dataset than the NYT29 dataset. Both datasets have a similar set of relations and similar texts (NYT). So task-wise both datasets should pose a similar challenge. However, the F1 scores suggest that the NYT24 dataset is easier than NYT29. The reason is that NYT24 has around 72.0% of overlapping tuples between the training and test data (% of test tuples that appear in the training data with different source sentences). In contrast, NYT29 has only 41.7% of overlapping tuples. Due to the memorization power of deep neural networks, it can achieve much higher F1 score on NYT24. The difference between the F1 scores of WordDecoding and PtrNetDecoding on NYT24 is marginally higher than NYT29, since WordDecoding has more trainable parameters (about 27 million) than PtrNetDecoding (about 24.5 million) and NYT24 has very high tuple overlap. However, their ensemble versions achieve closer F1 scores on both datasets.

从表3中可以看到,CopyR、HRL和我们的模型在NYT24数据集上的F1得分显著高于NYT29数据集。两个数据集有相似的关系集合和相似的文本(都来自NYT),因此就任务而言,这两个数据集应该带来类似的挑战。然而,F1分数表明NYT24数据集比NYT29更容易。原因是NYT24在训练数据和测试数据之间有大约72.0%的重叠元组(即以不同源语句出现在训练数据中的测试元组的百分比),相比之下,NYT29只有41.7%的重叠元组。由于深层神经网络的记忆能力,模型可以在NYT24上获得更高的F1分数。在NYT24上,WordDecoding和PtrNetDecoding的F1分数之差略高于NYT29,因为WordDecoding的可训练参数(约2700万)比PtrNetDecoding(约2450万)更多,而NYT24的元组重叠非常高。不过,它们的集成版本在两个数据集上取得了更为接近的F1分数。

Despite achieving marginally lower F1 scores, the pointer network-based model can be considered more intuitive and suitable for this task. WordDecoding may not extract the special tokens and relation tokens at the right time steps, which is critical for finding the tuples from the generated sequence of words. PtrNetDecoding always extracts two entities of varying length and a relation for every tuple. We also observe that PtrNetDecoding is more than two times faster and takes one-third of the GPU memory of WordDecoding during training and inference. This speedup and smaller memory consumption are achieved due to the fewer number of decoding steps of PtrNetDecoding compared to WordDecoding. PtrNetDecoding extracts an entire tuple at each time step, whereas WordDecoding extracts just one word at each time step and so requires eight time steps on average to extract a tuple (assuming that the average length of an entity is two). The softmax operation at the projection layer of WordDecoding is applied across the entire vocabulary and the vocabulary size can be large (more than 40,000 for our datasets). In case of PtrNetDecoding, the softmax operation is applied across the sentence length (maximum of 100 in our experiments) and across the relation set (24 and 29 for our datasets). The costly softmax operation and the higher number of decoding time steps significantly increase the training and inference time for WordDecoding. The encoder-decoder model proposed by Trisedya et al. (2019) faces a similar softmax-related problem as their target vocabulary contains the entire Wikidata entity IDs and relation IDs which is in the millions. HRL, which uses a deep reinforcement learning algorithm, takes around 8x more time to train than PtrNetDecoding with a similar GPU configuration. The speedup and smaller memory consumption will be useful when we move from sentence-level extraction to document-level extraction, since document length is much higher than sentence length and a document contains a higher number of tuples.

尽管F1成绩略低,但基于指针网络的模型更直观,也更适合此任务。WordDecoding可能无法在正确的时间步生成特殊标记和关系标记,而这对于从生成的单词序列中找出元组至关重要。PtrNetDecoding则总是为每个元组提取两个不同长度的实体和一个关系。我们还观察到,在训练和推理期间,PtrNetDecoding的速度比WordDecoding快两倍以上,且只占用WordDecoding三分之一的GPU内存。与WordDecoding相比,PtrNetDecoding的解码步骤更少,因此实现了这种加速和更小的内存消耗。PtrNetDecoding在每个时间步提取整个元组,而WordDecoding在每个时间步仅提取一个单词,因此平均需要八个时间步才能提取一个元组(假设实体的平均长度为2)。WordDecoding投影层的softmax操作应用于整个词汇表,而词汇表的规模可能很大(我们的数据集超过40000个词)。对于PtrNetDecoding,softmax操作只作用于句子长度(在我们的实验中最大为100)和关系集合(对于我们的数据集为24和29)。代价高昂的softmax操作和较多的解码时间步显著增加了WordDecoding的训练和推理时间。Trisedya et al. (2019)提出的编码器-解码器模型面临类似的softmax相关问题,因为其目标词汇表包含整个Wikidata的实体ID和关系ID,数量达数百万。HRL使用深度强化学习算法,在类似GPU配置下,其训练时间约为PtrNetDecoding的8倍。当我们从句子级提取转向文档级提取时,这种加速和较小的内存消耗将非常有用,因为文档长度远高于句子长度,并且文档包含更多的元组。

Error Analysis

The relation tuples extracted by a joint model can be erroneous for multiple reasons such as: (i) extracted entities are wrong; (ii) extracted relations are wrong; (iii) pairings of entities with relations are wrong. To see the effects of the first two reasons, we analyze the performance of HRL and our models on entity generation and relation generation separately. For entity generation, we only consider those entities which are part of some tuple. For relation generation, we only consider the relations of the tuples. We include the performance of our two models and HRL on entity generation and relation generation in Table 5. Our proposed models perform better than HRL on both tasks. Comparing our two models, PtrNetDecoding performs better than WordDecoding on both tasks, although WordDecoding achieves higher F1 scores in tuple extraction. This suggests that PtrNetDecoding makes more errors while pairing the entities with relations. We further analyze the outputs of our models and HRL to determine the errors due to ordering of entities (Order), mismatch of the first entity (Ent1), and mismatch of the second entity (Ent2) in Table 6. WordDecoding generates fewer errors than the other two models in all the categories and thus achieves the highest F1 scores on both datasets.

由联合模型提取的关系元组可能由于多种原因而出错,例如:(i)提取的实体错误;(ii)提取的关系错误;(iii)实体与关系的配对错误。为了了解前两个原因的影响,我们分别分析了HRL和我们的模型在实体生成和关系生成方面的表现。对于实体生成,我们只考虑那些属于某个元组的实体;对于关系生成,我们只考虑元组的关系。表5给出了我们的两个模型和HRL在实体生成和关系生成方面的性能,我们提出的模型在这两项任务上都优于HRL。比较我们的两个模型,尽管WordDecoding在元组提取中获得了更高的F1分数,但PtrNetDecoding在这两项任务上都比WordDecoding表现更好,这表明PtrNetDecoding在将实体与关系配对时产生了更多错误。我们进一步分析了我们的模型和HRL的输出,在表6中统计了由实体顺序(Order)、第一个实体不匹配(Ent1)和第二个实体不匹配(Ent2)导致的错误。WordDecoding在所有类别中产生的错误都比其他两个模型少,因此在两个数据集上都获得了最高的F1分数。

（截图：原文 Table 5 与 Table 6，错误分析）

 结合表5和表6可以看出,WDec产生的各类错误(Order、Ent1、Ent2)数量在同类比较中都是最低的,因此总体F1最高。

Related Work

Traditionally, researchers (Mintz et al. 2009; Riedel, Yao, and McCallum 2010; Hoffmann et al. 2011; Zeng et al. 2014; 2015; Shen and Huang 2016; Ren et al. 2017; Zhang et al. 2017; Jat, Khandelwal, and Talukdar 2017; Vashishth et al. 2018; Ye and Ling 2019; Guo, Zhang, and Lu 2019; Nayak and Ng 2019) used a pipeline approach for relation tuple extraction where relations were identified using a classification network after all entities were detected. 传统上,研究者使用管道方法提取关系元组:在检测到所有实体后,再使用分类网络识别关系。

Su et al.(2018) used an encoder-decoder model to extract multiple relations present between two given entities.Su等人(2018年)使用编码器-解码器模型提取两个给定实体之间存在的多个关系。

Recently, some researchers (Katiyar and Cardie 2016; Miwa and Bansal 2016; Bekoulis et al. 2018; Nguyen and Verspoor 2019) tried to bring these two tasks closer together by sharing their parameters and optimizing them together. 最近,一些研究人员试图通过共享参数并联合优化,使这两项任务更紧密地结合在一起。

Zheng et al. (2017) used a sequence tagging scheme to jointly extract the entities and relations.使用序列标记方案联合提取实体和关系

Zeng et al. (2018) proposed an encoder-decoder model with copy mechanism to extract relation tuples with overlapping entities.提出了一种具有复制机制的编码器-解码器模型,用于提取具有重叠实体的关系元组。

Takanobu et al. (2019) proposed a joint extraction model based on reinforcement learning (RL).提出了一种基于强化学习(RL)的联合提取模型。

Fu, Li, and Ma (2019) used a graph convolution network (GCN) where they treated each token in a sentence as a node in a graph and edges were considered as relations.使用了图卷积网络(GCN),他们将句子中的每个标记视为图中的一个节点,并将边视为关系。

Trisedya et al. (2019) used an N-gram attention mechanism with an encoder-decoder model for completion of knowledge bases using distant supervised data.使用N-gram注意机制和编码器-解码器模型,使用远程监督数据完成知识库

Encoder-decoder models have been used for many NLP applications such as neural machine translation (Sutskever, Vinyals, and Le 2014; Bahdanau, Cho, and Bengio 2015; Luong, Pham, and Manning 2015), sentence generation from structured data (Marcheggiani and Perez-Beltrachini 2018; Trisedya et al. 2018), and open information extraction (Zhang, Duh, and Van Durme 2017; Cui, Wei, and Zhou 2018). Pointer networks (Vinyals, Fortunato, and Jaitly 2015) have been used to extract a text span from text for tasks such as question answering (Seo et al. 2017; Kundu and Ng 2018). For the first time, we use pointer networks with an encoder-decoder model to extract relation tuples from sentences.

编码器-解码器模型已被用于许多NLP应用,如神经机器翻译(Sutskever, Vinyals, and Le 2014; Bahdanau, Cho, and Bengio 2015; Luong, Pham, and Manning 2015)、从结构化数据生成句子(Marcheggiani and Perez-Beltrachini 2018; Trisedya et al. 2018)以及开放信息抽取(Zhang, Duh, and Van Durme 2017; Cui, Wei, and Zhou 2018)。指针网络(Vinyals, Fortunato, and Jaitly 2015)已被用于从文本中抽取文本片段,用于问答等任务(Seo et al. 2017; Kundu and Ng 2018)。我们首次使用带有编码器-解码器模型的指针网络从句子中提取关系元组。

Conclusion

Extracting relation tuples from sentences is a challenging task due to different length of entities, the presence of multiple tuples, and overlapping of entities among tuples. In this paper, we propose two novel approaches using encoder-decoder architecture to address this task. Experiments on the New York Times (NYT) corpus show that our proposed models achieve significantly improved new state-of-the-art F1 scores. As future work, we would like to explore our proposed models for a document-level tuple extraction task.

从句子中提取关系元组是一项具有挑战性的任务,因为实体的长度不同、一个句子中存在多个元组、且元组之间存在实体重叠。在本文中,我们提出了两种使用编码器-解码器架构的新方法来解决这一任务。在纽约时报(NYT)语料库上的实验表明,我们提出的模型显著提高了F1分数,达到了新的最优水平。作为未来的工作,我们希望把所提出的模型扩展到文档级元组提取任务。

 手画一遍可以帮助理解不少内容,来自文档Figure 1.

 
