Two approaches to dependency parsing

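  1. Transition-based dependency parsing
    Parsing process: first define a set of actions, which essentially correspond to directed, typed edges. The parser then reads the words of the sentence from left to right and, for each word, chooses an action that incrementally extends the dependency tree, until every word in the sentence has been processed.
    Advantage: parsing is linear; the number of operations grows linearly with sentence length.
    Challenge: each step uses only local information, which leads to error propagation, so accuracy is slightly lower than that of graph-based parsers.
    Current work: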
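As a rough illustration of the parsing process described above, here is a minimal sketch of a greedy transition-based loop. It assumes the arc-standard action set (SHIFT, LEFT-ARC, RIGHT-ARC), which these notes do not name, and it omits edge types for brevity. The names `parse` and `make_oracle` are hypothetical; `make_oracle` stands in for a trained classifier by reading actions off a known gold tree, and it assumes that tree is projective.

```python
def parse(n_words, choose_action):
    """Greedy transition-based parsing loop (arc-standard, unlabeled).

    Words are numbered 1..n_words; 0 is an artificial ROOT.
    Returns the (head, dependent) arcs and the number of transitions used.
    """
    stack, buffer = [0], list(range(1, n_words + 1))
    arcs, steps = [], 0
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer)
        if action == "SHIFT":          # read the next word
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":     # top of stack becomes head of the word below it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "RIGHT-ARC":    # word below becomes head of the top of stack
            dep = stack.pop()
            arcs.append((stack[-1], dep))
        else:
            raise ValueError(f"unknown action: {action}")
        steps += 1
    return arcs, steps


def make_oracle(heads):
    """Stand-in for a trained classifier: picks actions from gold heads.

    heads[i] is the head of word i+1. Assumes the gold tree is projective.
    """
    head_of = {i + 1: h for i, h in enumerate(heads)}
    pending = [0] * (len(heads) + 1)   # dependents each word still has to collect
    for h in heads:
        pending[h] += 1

    def choose_action(stack, buffer):
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            if below != 0 and head_of[below] == top and pending[below] == 0:
                pending[top] -= 1
                return "LEFT-ARC"
            if head_of[top] == below and pending[top] == 0:
                pending[below] -= 1
                return "RIGHT-ARC"
        return "SHIFT"

    return choose_action


heads = [2, 0, 2]                      # "She ate fish": "ate" is the root
arcs, steps = parse(len(heads), make_oracle(heads))
print(arcs)    # [(2, 1), (2, 3), (0, 2)]
print(steps)   # 6, i.e. 2 * n transitions
```

In this sketch every word is shifted once and attached once, so a sentence of n words takes 2n transitions, which is the linear behaviour noted above. A real parser would replace the oracle with a learned model that sees only the current stack and buffer, and that is where the local-information problem comes from.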

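  2. Graph-based dependency parsing
    Parsing process: learn a scoring function, then, for a given sentence, perform a global exhaustive search over all possible parses (dependency trees) and return the highest-scoring tree.
    Advantage: currently more accurate than transition-based parsers.
    Challenge: the search is slow (e.g. decoding is O(n^3) for first-order models).
    Current work: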
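The graph-based idea above can be sketched in a few lines for a toy sentence. This is only an illustration under stated assumptions: the score of a tree is taken to be the sum of its edge scores (a first-order, arc-factored model), the `score` matrix is made up rather than learned, and the search literally enumerates every head assignment, which is feasible only for very short sentences. The function names are hypothetical.

```python
from itertools import product


def is_tree(heads):
    """heads[i] is the head of word i+1; 0 is ROOT.

    True if every word reaches ROOT by following heads (no cycles).
    """
    for start in range(1, len(heads) + 1):
        seen, node = set(), start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def tree_score(heads, score):
    """Sum of edge scores; score[h][d] scores the edge h -> d."""
    return sum(score[h][d] for d, h in enumerate(heads, start=1))


def best_tree(score):
    """Exhaustively search all dependency trees and return the best one."""
    n = len(score) - 1                 # number of words, excluding ROOT
    candidates = (h for h in product(range(n + 1), repeat=n) if is_tree(h))
    return max(candidates, key=lambda h: tree_score(h, score))


# Made-up scores for "She ate fish" (index 0 is ROOT, column 0 is unused).
score = [
    [0, 1, 9, 1],   # ROOT -> She, ate, fish
    [0, 0, 2, 1],   # She  -> ...
    [0, 8, 0, 7],   # ate  -> ...
    [0, 2, 3, 0],   # fish -> ...
]
heads = best_tree(score)
print(heads)                     # (2, 0, 2)
print(tree_score(heads, score))  # 24
```

The enumeration visits (n+1)^n head assignments, so it shows the global search in principle, not a practical decoder. Practical graph-based parsers rely on dynamic programming or maximum-spanning-tree algorithms instead.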

The following passage is quoted from the paper:
Transition-based dependency parsers read words sequentially (commonly from left-to-right) and build dependency trees incrementally by making series of multiple choice decisions. The advantage of this formalism is that the number of operations required to build any projective parse tree is linear with respect to the length of the sentence. The challenge, however, is that the decision made at each step is based on local information, leading to error propagation and worse performance compared to graph-based parsers on root and long dependencies (McDonald and Nivre, 2011). Previous studies have explored solutions to address this challenge. Stack LSTMs (Dyer et al., 2015; Ballesteros et al., 2015, 2016) are capable of learning representations of the parser state that are sensitive to the complete contents of the parser’s state. Andor et al. (2016) proposed a globally normalized transition model to replace the locally normalized classifier. However, the parsing accuracy is still behind state-of-the-art graph-based parsers (Dozat and Manning, 2017).

Graph-based dependency parsers, on the other hand, learn scoring functions for parse trees and perform exhaustive search over all possible trees for a sentence to find the globally highest scoring tree. Incorporating this global search algorithm with distributed representations learned from neural networks, neural graph-based parsers (Kiperwasser and Goldberg, 2016; Wang and Chang, 2016; Kuncoro et al., 2016; Dozat and Manning, 2017) have achieved the state-of-the-art accuracies on a number of treebanks in different languages. Nevertheless, these models, while accurate, are usually slow (e.g. decoding is O(n^3) time complexity for first-order models (McDonald et al., 2005a,b) and higher polynomials for higher-order models (McDonald and Pereira, 2006; Koo and Collins, 2010; Ma and Zhao, 2012b,a)).

(Source: arXiv:1805.01087v1 [cs.CL], 3 May 2018)
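To make the cubic decoding cost mentioned in the passage concrete, here is a minimal sketch of Eisner's algorithm, a standard dynamic program for projective first-order parsing. The passage itself does not name this algorithm, so treat the choice as an assumption. The sketch returns only the best score (no back-pointers), allows ROOT to have more than one child, and uses a made-up `score` matrix where `score[h][d]` scores the edge h -> d.

```python
def eisner_best_score(score):
    """Best projective tree score under an arc-factored model.

    score is an n x n matrix over ROOT (index 0) plus the words 1..n-1.
    Runs in O(n^3) time: O(n^2) spans, each with O(n) split points.
    """
    n = len(score)
    # Index [s][t][d]: d = 0 means the head is t (left), d = 1 means the head is s (right).
    complete = [[[0.0, 0.0] for _ in range(n)] for _ in range(n)]
    incomplete = [[[0.0, 0.0] for _ in range(n)] for _ in range(n)]

    for k in range(1, n):                  # span length
        for s in range(n - k):
            t = s + k
            # Join two complete spans facing each other, then add an edge between s and t.
            best = max(complete[s][r][1] + complete[r + 1][t][0] for r in range(s, t))
            incomplete[s][t][0] = best + score[t][s]    # edge t -> s
            incomplete[s][t][1] = best + score[s][t]    # edge s -> t
            # Extend an incomplete span with a complete span on the dependent's side.
            complete[s][t][0] = max(
                complete[s][r][0] + incomplete[r][t][0] for r in range(s, t)
            )
            complete[s][t][1] = max(
                incomplete[s][r][1] + complete[r][t][1] for r in range(s + 1, t + 1)
            )
    return complete[0][n - 1][1]           # ROOT heads the whole sentence


# Made-up scores for a three-word sentence (index 0 is ROOT).
score = [
    [0, 1, 9, 1],
    [0, 0, 2, 1],
    [0, 8, 0, 7],
    [0, 2, 3, 0],
]
print(eisner_best_score(score))   # 24.0
```

The three nested loops (span length, start position, split point) are where the O(n^3) comes from. Recovering the tree itself would require storing the best split point for each table entry, which is left out here to keep the sketch short.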
