Commonsense Knowledge Enhanced Embeddings for Solving Pronoun Disambiguation Problems in Winograd Schema Challenge

Authors: Quan Liu, Hui Jiang...

A Winograd schema question is a pair of sentences that differ in only one or two words, resulting in a different resolution of a coreference.

The commonsense knowledge is encoded as semantic constraints that guide the semantic word embedding training process; in other words, commonsense knowledge is used to guide how the word embeddings are learned.

The process of answering PDP problems can then be fulfilled by directly calculating the semantic similarities between the representation vector of the pronoun under concern and those of all candidate mentions; in the end, the PDP problem is treated as a vector-similarity computation.
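A minimal sketch of this answer rule, assuming the pronoun's context and each candidate mention have already been composed into single vectors (the composition step and the function names below are illustrative, not the authors' code):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def resolve_pronoun(pronoun_vec, candidate_vecs):
    """Return the candidate mention whose vector is most similar to the
    representation vector of the pronoun under concern."""
    scores = {m: cosine(pronoun_vec, v) for m, v in candidate_vecs.items()}
    return max(scores, key=scores.get), scores

# Toy usage with random 100-dimensional vectors (the dimension used in the paper).
rng = np.random.default_rng(0)
answer, scores = resolve_pronoun(
    rng.normal(size=100),
    {"the trophy": rng.normal(size=100), "the suitcase": rng.normal(size=100)},
)
print(answer, scores)
```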

The PDP task does not come with training data, so one must either train in an unsupervised way or find some way to obtain training data.

PDP problems do not involve examples with three or more sentences.

The difficulties of solving the complex PDP problems: 1) the lack of training data; 2) the requirement of commonsense reasoning.

The authors view the PDP task as a typical machine learning problem, but since there is no training data, they ultimately choose to design a problem solver.

Commonsense knowledge in Cyc is represented in a formal language.

Three commonsense knowledge bases are used in the experiments: ConceptNet, WordNet and CauseCom.

ConceptNet: similarities between linked concepts should be larger than similarities between unlinked concepts.

WordNet: 1) similarities between a word and its synonyms are larger than similarities between the word and its antonyms; 2) similarities of words that belong to the same semantic category should be larger than similarities of words that belong to different categories; 3) similarities between words that have shorter distances in the semantic hierarchy should be larger than similarities of words that are farther apart.

CauseCom: similarity is based on whether one word (causally) affects the other. CauseCom mainly contains verbs and adjectives, and does not include nouns, adverbs or prepositions.
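All three resources reduce to pairwise similarity inequalities of the form sim(a, b) > sim(a, c): linked versus unlinked concepts, synonyms versus antonyms, causally related versus unrelated word pairs. A hedged sketch of how such inequalities can be scored as a hinge-style penalty (the exact constraint extraction and loss form in the paper may differ):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def hinge_penalty(emb, constraints, margin=0.0):
    """Each constraint (a, b, c) encodes the inequality sim(a, b) >= sim(a, c),
    e.g. a ConceptNet-linked pair (a, b) versus an unlinked pair (a, c),
    or a WordNet synonym (a, b) versus an antonym (a, c).
    `emb` maps words to their embedding vectors."""
    total = 0.0
    for a, b, c in constraints:
        total += max(0.0, margin + cosine(emb[a], emb[c]) - cosine(emb[a], emb[b]))
    return total
```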

The word embeddings are trained with the skip-gram model, and the commonsense knowledge is incorporated as a penalty term.
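Conceptually, the training objective is the usual skip-gram loss plus the commonsense penalty weighted by a coefficient beta. A sketch reusing the hinge_penalty function above (the paper's exact formulation may differ):

```python
def combined_objective(skipgram_loss, emb, constraints, beta=0.01):
    """Skip-gram loss plus the commonsense constraint penalty, weighted by
    beta (the paper reports beta = 0.01, see below)."""
    return skipgram_loss + beta * hinge_penalty(emb, constraints)
```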

The corpora used to train the word embeddings are: 1) CBTest, a book corpus containing 300 million tokens and 53,541 words; 2) Wikipedia, with 1 billion tokens and 235,167 words.

The embedding dimension is 100.

The window size is 5.

The penalty term combination coefficient beta is 0.01.

The learning rate for SGD is 0.025.
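For reference, a plain skip-gram baseline with these hyperparameters can be configured in gensim as below; this is not the authors' implementation, and the commonsense penalty (beta = 0.01) would require a custom training loop on top of it:

```python
from gensim.models import Word2Vec

# Toy corpus; the paper trains on CBTest and Wikipedia.
corpus = [
    ["the", "trophy", "does", "not", "fit", "in", "the", "suitcase"],
    ["because", "it", "is", "too", "big"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=100,  # embedding dimension
    window=5,         # context window size
    sg=1,             # skip-gram architecture
    alpha=0.025,      # initial SGD learning rate
    min_count=1,      # keep every token in this toy corpus
)
print(model.wv["trophy"].shape)  # (100,)
```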

This paper uses the popular coreference resolution dataset OntoNotes to extract labelled mention pairs for model training.

The best accuracy on the PDP task to date is the 66.7% reported in this paper.

Comment: at present, the key to solving PDP lies in acquiring the relevant commonsense knowledge.
