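What kinds of structure does natural language have, and how can we model syntactic structure? Broadly, syntactic structure can be viewed in two ways:
phrase structure (context-free grammars), which organizes words into nested constituents;
dependency structure, which shows which words depend on (modify or are arguments of) which other words.
A sentence is built from progressively nested units: adjacent words or units combine into larger units called phrases, and those phrases in turn combine into still larger units: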
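The nesting of constituents can be sketched with a small recursive data structure. This is only an illustration: the phrase "the cat by the door", the labels, and the helper names `leaves` and `bracketed` are assumptions made for the example, not part of the original notes.

```python
# A constituent is either a word (str) or a (label, children) pair.
tree = ("NP", [
    ("NP", [("Det", ["the"]), ("N", ["cat"])]),
    ("PP", [("P", ["by"]), ("NP", [("Det", ["the"]), ("N", ["door"])])]),
])


def leaves(node):
    """Return the words covered by a constituent, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def bracketed(node):
    """Render a constituent in labelled bracket notation."""
    if isinstance(node, str):
        return node
    label, children = node
    return "[" + label + " " + " ".join(bracketed(c) for c in children) + "]"


print(" ".join(leaves(tree)))
print(bracketed(tree))
```

Each smaller unit (a determiner plus a noun) becomes a noun phrase, which then combines with a prepositional phrase into a larger noun phrase.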
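Dependency syntax explains how the different units of a sentence relate to one another. The same sentence can have different dependency structures, and different dependency structures can differ substantially in meaning. Dependency analysis therefore helps in understanding sentences and can improve the accuracy of tasks such as machine translation.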
Prepositional phrase attachment ambiguity
San Jose cops kill man with knife
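This sentence has two readings: either the cops used a knife to kill the man ("with knife" attaches to "kill"), or the cops killed a man who had a knife ("with knife" attaches to "man").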
Scientists count whales from space
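This sentence also has two readings: either the counting is done from space ("from space" attaches to "count"), or the whales come from space ("from space" attaches to "whales").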
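The two readings of the first headline can be written as two dependency analyses that differ in a single head. The sketch below assumes a Universal Dependencies style analysis in which the preposition "with" attaches to "knife"; the head arrays and helper names are illustrative.

```python
tokens = ["San", "Jose", "cops", "kill", "man", "with", "knife"]

# heads[i] is the 1-based index of the head of token i + 1; 0 marks the root.
heads_instrument = [2, 3, 4, 0, 4, 7, 4]  # "knife" modifies "kill"
heads_modifier = [2, 3, 4, 0, 4, 7, 5]    # "knife" modifies "man"


def head_word(tokens, head):
    """Map a head index to its word, using ROOT for index 0."""
    return "ROOT" if head == 0 else tokens[head - 1]


def differing_attachments(tokens, heads_a, heads_b):
    """List (word, head in A, head in B) for every word whose head differs."""
    diffs = []
    for word, a, b in zip(tokens, heads_a, heads_b):
        if a != b:
            diffs.append((word, head_word(tokens, a), head_word(tokens, b)))
    return diffs


print(differing_attachments(tokens, heads_instrument, heads_modifier))
```

Only the attachment of "knife" changes between the two parses, yet the meaning of the sentence changes completely.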
Coordination scope ambiguity
Shuttle veteran and longtime NASA executive Fred Gregory appointed to board
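This sentence has two readings: either one person was appointed (Fred Gregory, who is both a shuttle veteran and a longtime NASA executive), or two people were appointed (a shuttle veteran, and longtime NASA executive Fred Gregory).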
Dependency paths identify semantic relations
The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB
Dependency analysis lets us read protein-protein interactions off the parse, such as KaiC interacting with the three other proteins here.
KaiC is the nominal subject of "interacts", which takes SasA as a nominal modifier; SasA and its conjuncts KaiA and KaiB are the things KaiC interacts with.
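The extraction described above can be sketched as a walk over dependency edges. The edge list below is a hand-written, simplified analysis of the example clause (the relation names `nsubj`, `nmod:with` and `conj:and` follow a common enhanced-dependency style and are assumptions here), and `interactions` is a hypothetical helper, not a library function.

```python
# (head, relation, dependent) triples for
# "KaiC interacts rhythmically with SasA, KaiA, and KaiB".
edges = [
    ("interacts", "nsubj", "KaiC"),
    ("interacts", "advmod", "rhythmically"),
    ("interacts", "nmod:with", "SasA"),
    ("SasA", "conj:and", "KaiA"),
    ("SasA", "conj:and", "KaiB"),
]


def dependents(edges, head, relation):
    """Return the dependents of `head` attached by `relation`."""
    return [dep for h, rel, dep in edges if h == head and rel == relation]


def interactions(edges, verb="interacts"):
    """Pair the verb's subject with its nominal modifier and its conjuncts."""
    pairs = []
    for subject in dependents(edges, verb, "nsubj"):
        for partner in dependents(edges, verb, "nmod:with"):
            pairs.append((subject, partner))
            for conjunct in dependents(edges, partner, "conj:and"):
                pairs.append((subject, conjunct))
    return pairs


print(interactions(edges))
```

The path subject, verb, nominal modifier (plus conjuncts) yields one interaction pair per protein.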
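A dependency structure can be drawn in two ways, as a linear structure (arcs over the sentence) or as a tree, as in the left and right figures below: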
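Both views can be derived from the same head array. The following is a minimal sketch on an assumed four-word sentence; `linear_arcs` and `tree_lines` are illustrative helper names.

```python
tokens = ["Happy", "children", "like", "friends"]
heads = [2, 3, 0, 3]  # 1-based head index per token; 0 marks the root


def linear_arcs(tokens, heads):
    """Linear view: one (head, dependent) arc per word, in sentence order."""
    names = ["ROOT"] + tokens
    return [(names[h], tokens[i]) for i, h in enumerate(heads)]


def tree_lines(tokens, heads):
    """Tree view: each word indented under its head."""
    children = {}
    for dep, head in enumerate(heads, start=1):
        children.setdefault(head, []).append(dep)
    lines = []

    def visit(node, depth):
        lines.append("  " * depth + tokens[node - 1])
        for child in children.get(node, []):
            visit(child, depth + 1)

    for root in children.get(0, []):
        visit(root, 0)
    return lines


print(linear_arcs(tokens, heads))
print("\n".join(tree_lines(tokens, heads)))
```

The linear view keeps word order visible, while the tree view makes the head hierarchy explicit; both carry the same information.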
The Rise of Annotated Data: Universal Dependencies Treebanks
An open-source collection of dependency annotations, covering many languages.
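Universal Dependencies treebanks are distributed in the CoNLL-U format: one token per line with ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). As a rough sketch of reading such data, the sample sentence and the `read_conllu` helper below are made up for illustration; real treebank files contain more detail than this reader handles.

```python
SAMPLE = "\n".join([
    "# text = Children like friends",
    "1\tChildren\tchild\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tlike\tlike\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tfriends\tfriend\tNOUN\t_\t_\t2\tobj\t_\t_",
])


def read_conllu(text):
    """Return (id, form, head, deprel) for each ordinary token line."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        fields = line.split("\t")
        token_id = fields[0]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        rows.append((int(token_id), fields[1], int(fields[6]), fields[7]))
    return rows


for row in read_conllu(SAMPLE):
    print(row)
```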
Methods for building dependency parses:
Arc-standard transition-based parser
Analysis of Happy children like to play with their friends.
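The arc-standard system uses a stack, a buffer and three moves: SHIFT, LEFT-ARC and RIGHT-ARC. Below is a minimal sketch that replays a hand-written action sequence for the example sentence. The action sequence and the dependency labels are one plausible analysis chosen for illustration, not the output of a trained parser.

```python
def arc_standard(n_words, actions):
    """Replay arc-standard actions; words are 1..n_words, 0 is ROOT."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), []
    for name, label in actions:
        if name == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with empty buffer")
            stack.append(buffer.pop(0))
        elif name == "LEFT-ARC":
            # Second-from-top becomes a dependent of the top; ROOT cannot.
            if len(stack) < 3:
                raise ValueError("LEFT-ARC needs two words on the stack")
            dep = stack.pop(-2)
            arcs.append((stack[-1], label, dep))
        elif name == "RIGHT-ARC":
            # Top becomes a dependent of the second-from-top.
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC needs two items on the stack")
            dep = stack.pop()
            arcs.append((stack[-1], label, dep))
        else:
            raise ValueError("unknown action: " + name)
    return stack, buffer, arcs


words = ["Happy", "children", "like", "to", "play", "with", "their", "friends"]
actions = [
    ("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "amod"),
    ("SHIFT", None), ("LEFT-ARC", "nsubj"),
    ("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "mark"),
    ("SHIFT", None), ("SHIFT", None), ("SHIFT", None),
    ("LEFT-ARC", "nmod:poss"), ("LEFT-ARC", "case"),
    ("RIGHT-ARC", "obl"), ("RIGHT-ARC", "xcomp"), ("RIGHT-ARC", "root"),
]

stack, buffer, arcs = arc_standard(len(words), actions)
names = ["ROOT"] + words
for head, label, dep in arcs:
    print(names[head], "-" + label + "->", names[dep])
```

Every word is shifted once and reduced once, so a sentence of n words is parsed in exactly 2n moves (16 here).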
In fact, the parser faces a choice at every step about whether to shift or to reduce. You could explore the exponentially large space of possible parses, but that would not be efficient.
In the 1960s, people came up with clever dynamic programming algorithms that explore the space of all possible parses relatively efficiently.
In the 2000s (MaltParser), the approach became: at each position in the parse, the next action is predicted by a discriminative classifier (e.g. a softmax classifier) over the legal moves:
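Choosing a move with a softmax restricted to the legal moves can be sketched as follows. The raw scores here are hypothetical stand-ins for the output of a trained classifier, and the legality rules are the simple arc-standard preconditions (the stack size counts ROOT).

```python
import math


def legal_moves(stack_size, buffer_size):
    """Arc-standard preconditions; stack_size includes ROOT."""
    moves = []
    if buffer_size > 0:
        moves.append("SHIFT")
    if stack_size >= 3:  # ROOT may not become a dependent
        moves.append("LEFT-ARC")
    if stack_size >= 2:
        moves.append("RIGHT-ARC")
    return moves


def predict_move(scores, stack_size, buffer_size):
    """Softmax over the legal moves only; returns (best move, probabilities)."""
    legal = legal_moves(stack_size, buffer_size)
    if not legal:
        raise ValueError("no legal move: the parse is already complete")
    top = max(scores[m] for m in legal)  # subtract the max for stability
    exps = {m: math.exp(scores[m] - top) for m in legal}
    total = sum(exps.values())
    probs = {m: e / total for m, e in exps.items()}
    return max(probs, key=probs.get), probs


# Hypothetical classifier scores for one parser configuration.
scores = {"SHIFT": 1.0, "LEFT-ARC": 5.0, "RIGHT-ARC": 2.0}
best, probs = predict_move(scores, stack_size=2, buffer_size=3)
print(best, probs)
```

LEFT-ARC has the highest raw score but is illegal in this configuration (only ROOT and one word are on the stack), so the probability mass is shared between SHIFT and RIGHT-ARC.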
There is NO search (in the simplest form), but you can profitably do a beam search if you wish (slower but better): keep the k best parse prefixes at each time step.
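Keeping the k best parse prefixes can be sketched as below. The fixed per-move scores in `TOY_LOG_PROB` are a hypothetical stand-in for a trained classifier, and the state encoding (tuples for stack, buffer and arcs) is just one convenient choice. With k = 1 this reduces to greedy parsing.

```python
import math

# Hypothetical fixed move scores standing in for a trained classifier.
TOY_LOG_PROB = {
    "SHIFT": math.log(0.5),
    "LEFT-ARC": math.log(0.3),
    "RIGHT-ARC": math.log(0.2),
}


def legal_moves(state):
    stack, buffer, _ = state
    moves = []
    if buffer:
        moves.append("SHIFT")
    if len(stack) >= 3:  # ROOT (index 0) may not become a dependent
        moves.append("LEFT-ARC")
    if len(stack) >= 2:
        moves.append("RIGHT-ARC")
    return moves


def apply_move(state, move):
    stack, buffer, arcs = state
    if move == "SHIFT":
        return stack + buffer[:1], buffer[1:], arcs
    if move == "LEFT-ARC":
        head, dep = stack[-1], stack[-2]
        return stack[:-2] + (head,), buffer, arcs + ((head, dep),)
    head, dep = stack[-2], stack[-1]  # RIGHT-ARC
    return stack[:-1], buffer, arcs + ((head, dep),)


def is_final(state):
    stack, buffer, _ = state
    return stack == (0,) and not buffer


def beam_search(n_words, k, log_prob=lambda state, move: TOY_LOG_PROB[move]):
    """Return up to k complete parses as (score, state), best first."""
    beam = [(0.0, ((0,), tuple(range(1, n_words + 1)), ()))]
    while not all(is_final(state) for _, state in beam):
        candidates = []
        for score, state in beam:
            if is_final(state):
                candidates.append((score, state))
                continue
            for move in legal_moves(state):
                candidates.append(
                    (score + log_prob(state, move), apply_move(state, move))
                )
        # Keep only the k highest-scoring prefixes at this time step.
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beam


for score, (_, _, arcs) in beam_search(n_words=2, k=3):
    print(round(math.exp(score), 3), arcs)
```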
Evaluation of dependency parsing
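Dependency parsers are commonly scored by attachment accuracy: the unlabeled attachment score (UAS) counts words whose predicted head is correct, and the labeled attachment score (LAS) additionally requires the correct relation label. The sketch below uses made-up gold and predicted analyses; the function name `attachment_scores` is illustrative.

```python
def attachment_scores(gold, predicted):
    """gold and predicted are lists of (head, label), one entry per word."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty analyses of equal length")
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    both_hits = sum(g == p for g, p in zip(gold, predicted))
    return head_hits / len(gold), both_hits / len(gold)


# Toy data for a four-word sentence (heads are 1-based; 0 is ROOT).
gold = [(2, "amod"), (3, "nsubj"), (0, "root"), (3, "obj")]
predicted = [(2, "amod"), (3, "obj"), (0, "root"), (2, "obj")]

uas, las = attachment_scores(gold, predicted)
print("UAS =", uas, "LAS =", las)
```

In this toy case three of four heads are right (UAS 0.75), but one of those carries the wrong label, so only two of four words count for LAS (0.5).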
sparse, and tend to be incomplete;