Natural Language Processing: Dependency Parsing

Contents

  • Phrase structure grammar
  • Why do we need dependency structure?
  • Dependency structure
  • Transition-based dependency parsers
  • Why train a neural dependency parser?

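What kinds of structure does natural language have, and how do we model syntactic structure? Broadly, there are two views of syntactic structure: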

  • phrase structure (context-free grammars), which organizes words into nested constituents;
  • dependency structure, which shows which words depend on (modify or are arguments of) which other words;

Phrase structure grammar

A sentence is built from progressively nested units. Adjacent words or units combine into a larger unit, called a phrase, and phrases in turn combine into still larger units.
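The nesting described above can be sketched as a small data structure. This is only an illustration: the example phrase and the labels (NP, PP, Det, and so on) are assumptions chosen for the sketch, not taken from the article.

```python
# A constituent is (label, child, child, ...); a leaf is (POS tag, word).
# The phrase and the labels are illustrative assumptions.
tree = ("NP",
        ("Det", "the"),
        ("Adj", "cuddly"),
        ("N", "cat"),
        ("PP",
         ("P", "by"),
         ("NP", ("Det", "the"), ("N", "door"))))


def is_leaf(node):
    """A leaf has exactly one child and that child is the word itself."""
    return len(node) == 2 and isinstance(node[1], str)


def words(node):
    """Return the words covered by a constituent, left to right."""
    if is_leaf(node):
        return [node[1]]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out


def depth(node):
    """Number of nested levels, counting the leaf level as 1."""
    if is_leaf(node):
        return 1
    return 1 + max(depth(child) for child in node[1:])


print(" ".join(words(tree)))
print(depth(tree))
```

Each tuple is one constituent, so the smaller NP "the door" sits inside the PP, which sits inside the outer NP.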

Why do we need dependency structure?

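Dependency structure explains how the units of a sentence relate to one another. The same sentence can have different dependency structures, and different structures can carry quite different meanings. Recovering the right dependency structure therefore helps in understanding a sentence and can improve accuracy on tasks such as machine translation.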


Prepositional phrase attachment ambiguity

San Jose cops kill man with knife

This sentence has two readings:

  • the cops used a knife to kill the man ("with knife" attaches to "kill");
  • the man who was killed had a knife ("with knife" attaches to "man");

Scientists count whales from space

This sentence also has two readings:

  • the scientists count the whales from space, using something like a satellite ("from space" attaches to "count");
  • the whales come from space ("from space" attaches to "whales");
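The two readings of "Scientists count whales from space" can be written as two head arrays that differ in a single arc. This is a sketch under an assumed annotation convention (the noun "space" heads the prepositional phrase and "from" depends on it); other conventions would place the arcs differently.

```python
# Index 0 is an artificial ROOT token; heads[i] is the index of the head of words[i].
# Assumed convention: the noun heads the prepositional phrase.
words = ["ROOT", "Scientists", "count", "whales", "from", "space"]

verb_attach = [None, 2, 0, 2, 5, 2]  # "from space" modifies "count"
noun_attach = [None, 2, 0, 2, 5, 3]  # "from space" modifies "whales"


def differing_arcs(a, b):
    """List (dependent, head in a, head in b) for every word whose head differs."""
    return [
        (words[i], words[a[i]], words[b[i]])
        for i in range(1, len(words))
        if a[i] != b[i]
    ]


print(differing_arcs(verb_attach, noun_attach))
```

Only the head of "space" changes between the two analyses, which is exactly the attachment decision the parser has to make.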

Coordination scope ambiguity

Shuttle veteran and longtime NASA executive Fred Gregory appointed to board

This sentence has two readings:

  • one person, Fred Gregory, who is both a shuttle veteran and a longtime NASA executive, was appointed to the board;
  • two people, a shuttle veteran and longtime NASA executive Fred Gregory, were both appointed to the board;

Dependency paths identify semantic relations

The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB

Dependency analysis lets us extract protein-protein interactions from text, such as KaiC interacting with the three other proteins in this sentence.


In the parse, KaiC is the nominal subject of "interacts", and SasA attaches to "interacts" as a nominal modifier. KaiA and KaiB hang beneath SasA as its conjuncts, so all three are things that KaiC interacts with.


Dependency structure

A dependency structure can be represented in two ways: as a linear representation, with arcs drawn over the words of the sentence, or as a tree.
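The two representations carry the same information, which a short sketch can make concrete. Here the linear form is a list of head indices and the tree form is a mapping from each head to its dependents; the sentence fragment and its heads are assumptions made for illustration.

```python
from collections import defaultdict

# Linear form: heads[i] is the index of the head of words[i]; index 0 is ROOT.
# The heads below are an assumed analysis, used only for illustration.
words = ["ROOT", "Happy", "children", "like", "to", "play"]
heads = [None, 2, 3, 0, 5, 3]


def to_tree(heads):
    """Tree form: map each head index to the list of its dependents."""
    children = defaultdict(list)
    for dep in range(1, len(heads)):
        children[heads[dep]].append(dep)
    return children


def render(node, children, depth=0):
    """Return the tree as indented lines, one word per line."""
    lines = ["  " * depth + words[node]]
    for child in children[node]:
        lines.extend(render(child, children, depth + 1))
    return lines


print("\n".join(render(0, to_tree(heads))))
```

Because every word has exactly one head, the head list is enough to rebuild the whole tree.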

The Rise of Annotated Data: Universal Dependencies Treebanks


Universal Dependencies is an open-source collection of dependency-annotated treebanks covering many languages.


Methods for dependency parsing:

  • dynamic programming, with complexity O(n^3);
  • graph algorithms;
  • constraint satisfaction;
  • “transition-based parsing” or “deterministic dependency parsing”;

Transition-based dependency parsers

Arc-standard transition-based parser

Analysis of the sentence "Happy children like to play with their friends."
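The analysis of this sentence can be replayed in code. The following is a minimal sketch of an arc-standard parser driven by a simple static oracle: given gold heads, the oracle decides whether to SHIFT, LEFT-ARC, or RIGHT-ARC. The gold heads are an assumed analysis of the sentence, and the oracle is a common textbook formulation rather than anything specified in this article.

```python
# Index 0 is ROOT. The gold heads are an assumed analysis for illustration.
words = ["ROOT", "Happy", "children", "like", "to", "play", "with", "their", "friends"]
gold_heads = [None, 2, 3, 0, 5, 3, 8, 8, 5]


def arc_standard_oracle(heads):
    """Return (actions, arcs) that rebuild a projective tree from its gold heads."""
    stack, buffer = [0], list(range(1, len(heads)))
    actions, arcs = [], []  # arcs are (head, dependent) pairs
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, second = stack[-1], stack[-2]
            # LEFT-ARC: the second item on the stack depends on the top item.
            if second != 0 and heads[second] == top:
                actions.append("LEFT-ARC")
                arcs.append((top, second))
                del stack[-2]
                continue
            # RIGHT-ARC: the top item depends on the second item, and the
            # top item has already collected all of its own dependents.
            if heads[top] == second and all(heads[b] != top for b in buffer):
                actions.append("RIGHT-ARC")
                arcs.append((second, top))
                stack.pop()
                continue
        if not buffer:
            raise ValueError("tree is not projective; arc-standard cannot build it")
        actions.append("SHIFT")
        stack.append(buffer.pop(0))
    return actions, arcs


actions, arcs = arc_standard_oracle(gold_heads)
for head, dep in arcs:
    print(f"{words[head]} -> {words[dep]}")
print(len(actions), "actions")
```

A sentence of n words always takes 2n actions in this scheme: one SHIFT per word and one arc action per word.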


At each step the parser has to choose between shifting and reducing. Exploring every choice means searching an exponentially large space of possible parses, which cannot be done efficiently.

In the 1960s, people came up with clever dynamic programming algorithms that explore the space of all possible parses relatively efficiently.

In the 2000s, MaltParser took a different approach: at each position in the parse, the next action is predicted by a discriminative classifier (e.g. a softmax classifier) over the legal moves:

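  • at most 3 actions when arcs are untyped (SHIFT, LEFT-ARC, RIGHT-ARC); at most |R| × 2 + 1 when arcs are typed with a relation set R;
  • features: the word and POS tag on top of the stack; the word and POS tag at the front of the buffer; etc.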
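The action count and the feature list above can be sketched as follows. The function names, the feature string format, and the conjunction feature at the end are hypothetical choices made for this illustration; they are not MaltParser's actual feature templates.

```python
def num_actions(num_relations=None):
    """SHIFT plus one LEFT-ARC and one RIGHT-ARC, each multiplied by |R| when typed."""
    if num_relations is None:
        return 3
    return 2 * num_relations + 1


def extract_features(stack, buffer, words, tags):
    """Indicator features for one parser configuration (hypothetical templates)."""
    feats = []
    if stack:
        s1 = stack[-1]
        feats.append(f"s1.w={words[s1]}")  # word on top of the stack
        feats.append(f"s1.t={tags[s1]}")   # its POS tag
    if buffer:
        b1 = buffer[0]
        feats.append(f"b1.w={words[b1]}")  # first word in the buffer
        feats.append(f"b1.t={tags[b1]}")   # its POS tag
    if stack and buffer:
        # Assumed example of a conjunction feature combining two atoms.
        feats.append(f"s1.w={words[stack[-1]]}+b1.t={tags[buffer[0]]}")
    return feats


words = ["ROOT", "Happy", "children", "like"]
tags = ["ROOT", "ADJ", "NOUN", "VERB"]
print(extract_features([0, 1], [2, 3], words, tags))
print(num_actions(), num_actions(40))
```

Each feature string would be one dimension of the classifier's input, switched on only when that exact combination occurs.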

In the simplest form there is NO search: the parser greedily takes the classifier's best action at each step. You can profitably add a beam search if you wish (slower but better), keeping the k best parse prefixes at each time step.
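The difference between greedy decoding and a beam can be sketched with a generic beam search over action sequences. Everything here is a toy assumption: the states are plain tuples of actions rather than parser configurations, and the scores are made-up integers standing in for classifier scores.

```python
import heapq


def beam_search(initial, legal_actions, apply, score, is_final, k):
    """Keep the k best-scoring action prefixes at each step; return the best one."""
    beam = [(0.0, initial, [])]  # (total score, state, actions so far)
    while not all(is_final(state) for _, state, _ in beam):
        candidates = []
        for total, state, history in beam:
            if is_final(state):
                candidates.append((total, state, history))
                continue
            for action in legal_actions(state):
                candidates.append(
                    (total + score(state, action), apply(state, action), history + [action])
                )
        beam = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])


# Toy problem: two steps; the locally best first action leads to a worse total.
TOY_SCORES = {
    ((), "A"): 6, ((), "B"): 4,
    (("A",), "X"): 1, (("A",), "Y"): 1,
    (("B",), "X"): 9, (("B",), "Y"): 1,
}


def toy_legal(state):
    return ["A", "B"] if len(state) == 0 else ["X", "Y"]


def run(k):
    return beam_search(
        initial=(),
        legal_actions=toy_legal,
        apply=lambda state, action: state + (action,),
        score=lambda state, action: TOY_SCORES[(state, action)],
        is_final=lambda state: len(state) == 2,
        k=k,
    )


print(run(1)[0], run(2)[0])
```

With k=1 the search is greedy and commits to "A"; with k=2 the prefix "B" survives long enough for its better continuation to win.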


Evaluation of dependency parsing

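  • unlabeled attachment score (UAS): the fraction of words whose head is predicted correctly;
  • labeled attachment score (LAS): the fraction of words whose head and relation label are both predicted correctly;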
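Both scores are simple per-word accuracies, as the sketch below shows. The example sentence ("She saw the video lecture"), its gold analysis, and the predicted analysis are assumptions chosen for illustration.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for two equal-length lists of (head, label) pairs."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head correct
    las_hits = sum(g == p for g, p in zip(gold, pred))        # head and label correct
    return uas_hits / len(gold), las_hits / len(gold)


# One (head index, relation label) pair per word of "She saw the video lecture".
gold = [(2, "nsubj"), (0, "root"), (5, "det"), (5, "nn"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (4, "det"), (5, "nsubj"), (2, "ccomp")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.0%}, LAS = {las:.0%}")
```

Four of the five heads are right, but only two words have both the right head and the right label, so LAS can never exceed UAS.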

Why train a neural dependency parser?

  • the indicator features that people hand-engineered were very sparse and tended to be incomplete;
  • computing millions of features was expensive;
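A rough sketch of the contrast: a single conjunction template over words and tags already spans millions of possible indicator features, while a dense input is just a few small embedding vectors concatenated. The vocabulary size, tag count, embedding dimension, and random embedding values are all assumptions for illustration, not figures from any particular parser.

```python
import random

# Sparse side: one template "stack-top word + buffer-front tag" (assumed sizes).
VOCAB_SIZE = 50_000
NUM_TAGS = 45
template_size = VOCAB_SIZE * NUM_TAGS  # possible indicator features for ONE template

# Dense side: every word or tag gets a small vector (tiny dimension for display).
EMB_DIM = 4
random.seed(0)
symbols = ["Happy", "children", "ADJ", "NOUN"]
embedding = {s: [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)] for s in symbols}


def dense_input(active_symbols):
    """Concatenate the embeddings of the symbols read off the configuration."""
    vec = []
    for s in active_symbols:
        vec.extend(embedding[s])
    return vec


x = dense_input(["Happy", "ADJ", "children", "NOUN"])
print(template_size, len(x))
```

The dense vector has a fixed, small length no matter how large the vocabulary is, whereas the indicator space grows with every new template.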
