Paper notes: Reasoning with Neural Tensor Networks for Knowledge Base Completion

  • Paper points
    • Neural tensor networks
    • Entity embedding as average word embeddings
    • Pretrained word embedding vectors used in initialization
  • Conclusion

Paper points

This paper uses NTN (Neural Tensor Networks) to learn and predict relations between entity embeddings in a knowledge base, i.e. knowledge base completion. It centers on three key ideas.

Neural tensor networks

A triple (e1, R, e2) states that relation R holds between entities e1 and e2. The model is trained to score existing (positive) triples high and negative samples low. The scoring function in this paper is a bilinear one, built from the terms e1^T M e2, V [e1; e2], and a bias b.
In my opinion, e1^T M e2 models interactions between the two entity vectors, V [e1; e2] is a linear combination of them, and b is a bias, where M, V, and b are relation-specific parameters of the model.
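
As a rough illustration, here is a minimal NumPy sketch of such a bilinear scoring function. The function name ntn_score, the stacking of k tensor slices, the tanh nonlinearity, and the output weight vector u are my rendering of the description above, not code from the paper:

```python
import numpy as np

def ntn_score(e1, e2, M, V, b, u):
    """Bilinear NTN-style score for a single triple (e1, R, e2).

    e1, e2 : (d,)      entity vectors
    M      : (k, d, d) relation-specific tensor slices (bilinear term e1^T M^[i] e2)
    V      : (k, 2d)   relation-specific linear weights over [e1; e2]
    b      : (k,)      relation-specific bias
    u      : (k,)      output weights mapping the k activations to a scalar score
    """
    bilinear = np.einsum('i,kij,j->k', e1, M, e2)     # e1^T M^[i] e2 for each slice i
    linear = V @ np.concatenate([e1, e2])             # linear combination of both entities
    return float(u @ np.tanh(bilinear + linear + b))  # scalar score for the triple
```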
The loss is a contrastive max-margin objective. For each existing (positive) triple (e1, R, e2), either e1 or e2 is replaced with a randomly sampled entity e' to form a negative triple, e.g. (e1, R, e'); the loss is then
max(0, 1 - (g(e1, R, e2) - g(e1, R, e')))
plus an L2 regularization term, where g is the scoring function.
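
Continuing that sketch, a hypothetical loss computation with a randomly corrupted triple could look like this (the margin of 1 matches the formula above; the helper name max_margin_loss and the regularization weight lam are illustrative):

```python
def max_margin_loss(pos_score, neg_score, params, lam=1e-4):
    """Contrastive max-margin loss for one positive/negative pair plus L2 regularization."""
    margin = max(0.0, 1.0 - (pos_score - neg_score))
    l2 = sum(np.sum(p ** 2) for p in params)
    return margin + lam * l2

# Negative sampling: corrupt the positive triple (e1, R, e2) by replacing e2
# with a randomly chosen entity e_rand from the entity embedding table.
# e_rand = entity_vectors[np.random.randint(len(entity_vectors))]
# loss = max_margin_loss(ntn_score(e1, e2, M, V, b, u),
#                        ntn_score(e1, e_rand, M, V, b, u),
#                        params=(M, V, b, u))
```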

Entity embedding as average word embeddings

Composing an entity vector from the vectors of its words exploits semantic information shared across entity names, and makes it possible to mine underlying text as a new information source.
This word-level representation improves performance beyond what the NTN with independent entity vectors achieves.
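
A minimal sketch of this idea, assuming a dictionary word_vecs that maps words to NumPy vectors (the helper name entity_vector is mine):

```python
def entity_vector(entity_name, word_vecs):
    """Represent an entity as the average of its word vectors,
    e.g. 'bengal tiger' -> mean(word_vecs['bengal'], word_vecs['tiger'])."""
    words = entity_name.split()
    return np.mean([word_vecs[w] for w in words], axis=0)
```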

Pretrained word embedding vectors used in initialization

Initializing the word vectors with pre-trained word embeddings incorporates distributional information from a large corpus and makes the results even better.
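
A sketch of that initialization, assuming a dictionary pretrained of unsupervised word vectors; the helper name, dimension, fallback scale, and seed are illustrative rather than the paper's exact settings:

```python
def init_word_embeddings(vocab, pretrained, dim=100, scale=0.1, seed=0):
    """Build the word-embedding matrix, copying pre-trained vectors where available
    and falling back to small random values for out-of-vocabulary words."""
    rng = np.random.default_rng(seed)
    E = scale * rng.standard_normal((len(vocab), dim))
    for i, word in enumerate(vocab):
        if word in pretrained:
            E[i] = np.asarray(pretrained[word], dtype=float)
    return E
```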

Conclusion

Information sources matter for models; they need not be new sources, and are sometimes ones that have been overlooked.

Relations between vectors can be modeled as interactions via bilinear functions, or as traditional linear combinations; these correspond to multiplication and summation of the variables, respectively.

Contrastive max-margin is an often-used way to construct loss functions for non-probabilistic problems.

Negative sampling is widely used to construct negative training examples.

Word embeddings are important information sources for KB-like, text-based applications.

