Phrase structure (context-free grammars) organizes words into nested constituents; dependency structure shows which words depend on (modify or are arguments of) which other words.
Sentences are built from progressively nested units: adjacent words or units combine into larger units called phrases, and those phrases in turn combine into still larger units.
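Dependency syntax explains how the different units of a sentence relate to one another. The same sentence can have different dependency structures, and different dependency structures can differ substantially in meaning. Dependency analysis therefore helps us understand sentences better and improves accuracy on tasks such as machine translation.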
Prepositional phrase attachment ambiguity
San Jose cops kill man with knife
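This sentence has two readings: the cops used a knife to kill the man ("with knife" attaches to "kill"), or the cops killed a man who had a knife ("with knife" attaches to "man").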
Scientists count whales from space
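This sentence has two readings: the counting is done from space ("from space" attaches to "count"), or the whales themselves come from space ("from space" attaches to "whales").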
Coordination scope ambiguity
Shuttle veteran and longtime NASA executive Fred Gregory appointed to board
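This sentence has two readings: one person was appointed (Fred Gregory is both a shuttle veteran and a longtime NASA executive), or two people were appointed (an unnamed shuttle veteran, and longtime NASA executive Fred Gregory).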
Dependency paths identify semantic relations
The results demonstrated that KaiC interacts rhythmically with SasA, KaiA, and KaiB
Dependency analysis lets us extract protein-protein interactions, such as KaiC interacting with the three other proteins in this sentence.
KaiC is the nominal subject of "interacts", and SasA attaches to "interacts" as a nominal modifier. KaiA and KaiB are conjoined with SasA, so all three are things that KaiC interacts with.
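As a rough illustration of how such relations can be read off a parse, the sketch below walks a hand-written edge list for this sentence. The edge list, the relation labels (nsubj, nmod, conj) and the helper name interaction_pairs are assumptions made for the example, not the output of any particular parser.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (head word, relation label, dependent word)

# Assumed, hand-written dependency edges for the example sentence.
EDGES: List[Edge] = [
    ("demonstrated", "nsubj", "results"),
    ("demonstrated", "ccomp", "interacts"),
    ("interacts", "nsubj", "KaiC"),
    ("interacts", "advmod", "rhythmically"),
    ("interacts", "nmod", "SasA"),
    ("SasA", "conj", "KaiA"),
    ("SasA", "conj", "KaiB"),
]


def interaction_pairs(edges: List[Edge], verb: str = "interacts") -> List[Tuple[str, str]]:
    """Pair each subject of `verb` with its nominal modifiers and their conjuncts."""
    children: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, rel, dep in edges:
        children[head].append((rel, dep))

    subjects = [dep for rel, dep in children[verb] if rel == "nsubj"]
    partners: List[str] = []
    for rel, dep in children[verb]:
        if rel == "nmod":
            partners.append(dep)
            # Conjuncts hang off the first item of the coordination.
            partners.extend(d for r, d in children[dep] if r == "conj")
    return [(subj, partner) for subj in subjects for partner in partners]


PAIRS = interaction_pairs(EDGES)
print(PAIRS)
```

With the edges above, this yields the three pairs (KaiC, SasA), (KaiC, KaiA) and (KaiC, KaiB). The word "rhythmically" is skipped because it is attached as an adverbial modifier rather than on the subject-to-modifier path.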
The Rise of Annotated Data: Universal Dependencies Treebanks
Arc-standard transition-based parser
Analysis of "Happy children like to play with their friends."
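To make the transition system concrete, here is a minimal sketch of the arc-standard transitions (SHIFT, LEFT-ARC, RIGHT-ARC), driven by a simple oracle that reads its decisions from a known tree. The head indices in HEADS are an assumed analysis of the sentence above, and the function name arc_standard_oracle is hypothetical.

```python
from typing import Dict, List, Tuple

WORDS = ["Happy", "children", "like", "to", "play", "with", "their", "friends"]

# Assumed analysis: word index (1-based) -> head index, with 0 standing for ROOT.
HEADS: Dict[int, int] = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3, 6: 8, 7: 8, 8: 5}


def arc_standard_oracle(heads: Dict[int, int]) -> Tuple[List[str], Dict[int, int]]:
    """Derive an arc-standard transition sequence for a projective tree."""
    stack: List[int] = [0]                          # starts with ROOT only
    buffer: List[int] = list(range(1, len(heads) + 1))
    built: Dict[int, int] = {}                      # dependent -> head, as arcs are added
    transitions: List[str] = []

    def dependents_attached(word: int) -> bool:
        return all(dep in built for dep, head in heads.items() if head == word)

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            s2, s1 = stack[-2], stack[-1]
            if s2 != 0 and heads[s2] == s1:
                # LEFT-ARC: top of stack heads the item below it; remove that item.
                built[s2] = s1
                del stack[-2]
                transitions.append("LEFT-ARC")
                continue
            if heads[s1] == s2 and dependents_attached(s1):
                # RIGHT-ARC: second item heads the top; pop the top.
                built[s1] = s2
                stack.pop()
                transitions.append("RIGHT-ARC")
                continue
        if not buffer:
            raise ValueError("tree is not projective; arc-standard cannot derive it")
        # SHIFT: move the next input word onto the stack.
        stack.append(buffer.pop(0))
        transitions.append("SHIFT")
    return transitions, built


TRANSITIONS, BUILT = arc_standard_oracle(HEADS)
print(TRANSITIONS)
```

For this tree the oracle emits 16 transitions (8 SHIFTs and 8 arc moves), and the recovered heads match HEADS. RIGHT-ARC waits until a word has collected all of its own dependents, because arc-standard removes a word from the stack as soon as it is attached to its head.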
In practice, the parser has a choice at each step of whether to shift or to reduce. Exploring every choice means searching an exponentially large space of possible parses, which cannot be done efficiently by brute force.
In the 1960s, clever dynamic programming algorithms were devised that explore the space of all possible parses relatively efficiently.
In the 2000s (MaltParser), a simpler approach appeared: at each position in the parse, the next action is predicted by a discriminative classifier (e.g. a softmax classifier) over the legal moves.
There is NO search (in the simplest form), but you can profitably do a beam search if you wish (slower but better): keep the k best parse prefixes at each time step.
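The sketch below shows one way the greedy and beam-search variants can share the same loop: a softmax over the legal moves scores each action, and the beam keeps the k best prefixes (k = 1 is the greedy parser). It is not MaltParser's implementation. The names beam_parse and make_toy_scorer are hypothetical, and the toy scorer simply peeks at an assumed gold tree in place of a trained classifier.

```python
import math
from typing import Callable, Dict, List, Tuple

# A parser state: (stack, buffer, arcs), where arcs holds (head, dependent) pairs.
State = Tuple[Tuple[int, ...], Tuple[int, ...], Tuple[Tuple[int, int], ...]]
Scorer = Callable[[State], Dict[str, float]]


def legal_moves(state: State) -> List[str]:
    stack, buffer, _ = state
    moves = []
    if buffer:
        moves.append("SHIFT")
    if len(stack) >= 2:
        moves.append("RIGHT-ARC")
        if stack[-2] != 0:  # ROOT (index 0) can never become a dependent
            moves.append("LEFT-ARC")
    return moves


def apply_move(state: State, move: str) -> State:
    stack, buffer, arcs = state
    if move == "SHIFT":
        return stack + buffer[:1], buffer[1:], arcs
    s2, s1 = stack[-2], stack[-1]
    if move == "LEFT-ARC":  # s1 becomes the head of s2
        return stack[:-2] + (s1,), buffer, arcs + ((s1, s2),)
    return stack[:-1], buffer, arcs + ((s2, s1),)  # RIGHT-ARC: s2 heads s1


def log_softmax(scores: Dict[str, float], moves: List[str]) -> Dict[str, float]:
    """Normalise raw scores over the legal moves only."""
    top = max(scores[m] for m in moves)
    log_z = top + math.log(sum(math.exp(scores[m] - top) for m in moves))
    return {m: scores[m] - log_z for m in moves}


def beam_parse(n_words: int, scorer: Scorer, k: int = 1) -> Dict[int, int]:
    """Return dependent -> head for the best parse found; k=1 is greedy."""
    start: State = ((0,), tuple(range(1, n_words + 1)), ())
    beam: List[Tuple[float, State]] = [(0.0, start)]
    while any(legal_moves(state) for _, state in beam):
        candidates: List[Tuple[float, State]] = []
        for logp, state in beam:
            moves = legal_moves(state)
            if not moves:  # finished parse: carry it forward unchanged
                candidates.append((logp, state))
                continue
            for move, move_logp in log_softmax(scorer(state), moves).items():
                candidates.append((logp + move_logp, apply_move(state, move)))
        # Keep only the k highest-scoring parse prefixes.
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    _, (_, _, arcs) = beam[0]
    return {dep: head for head, dep in arcs}


def make_toy_scorer(heads: Dict[int, int]) -> Scorer:
    """Stand-in for a trained classifier: favours the move a gold tree implies."""
    def scorer(state: State) -> Dict[str, float]:
        stack, _, arcs = state
        attached = {dep for _, dep in arcs}
        scores = {"SHIFT": 0.0, "LEFT-ARC": 0.0, "RIGHT-ARC": 0.0}
        preferred = "SHIFT"
        if len(stack) >= 2:
            s2, s1 = stack[-2], stack[-1]
            done = all(d in attached for d, h in heads.items() if h == s1)
            if s2 != 0 and heads[s2] == s1:
                preferred = "LEFT-ARC"
            elif heads[s1] == s2 and done:
                preferred = "RIGHT-ARC"
        scores[preferred] = 5.0
        return scores
    return scorer


# Assumed heads for "Happy children like to play with their friends" (0 = ROOT).
HEADS = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3, 6: 8, 7: 8, 8: 5}
GREEDY = beam_parse(len(HEADS), make_toy_scorer(HEADS), k=1)
BEAM = beam_parse(len(HEADS), make_toy_scorer(HEADS), k=3)
```

In a real parser the scorer would be a model over features of the stack and buffer. The beam only changes how many partial parses survive each step, which is why it costs roughly k times as much as the greedy version.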
Evaluation of dependency parsing
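Dependency parsers are commonly scored by attachment accuracy: the unlabeled attachment score (UAS) counts words whose predicted head is correct, and the labeled attachment score (LAS) also requires the relation label to match. The sketch below computes both. The gold and predicted analyses of "Scientists count whales from space" and the function name attachment_scores are made up for illustration.

```python
from typing import Dict, Tuple

Arc = Tuple[int, str]  # (head index, relation label); head 0 is ROOT


def attachment_scores(gold: Dict[int, Arc], pred: Dict[int, Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions of words, given per-word (head, label) pairs."""
    if not gold or gold.keys() != pred.keys():
        raise ValueError("gold and predicted parses must cover the same non-empty word set")
    total = len(gold)
    head_hits = sum(pred[w][0] == gold[w][0] for w in gold)   # head only
    arc_hits = sum(pred[w] == gold[w] for w in gold)          # head and label
    return head_hits / total, arc_hits / total


# Illustrative (assumed) analyses: 1 Scientists, 2 count, 3 whales, 4 from, 5 space.
GOLD = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obj"), 4: (5, "case"), 5: (2, "obl")}
# The prediction attaches "space" to "whales" and mislabels "from".
PRED = {1: (2, "nsubj"), 2: (0, "root"), 3: (2, "obj"), 4: (5, "mark"), 5: (3, "nmod")}

UAS, LAS = attachment_scores(GOLD, PRED)
print(f"UAS = {UAS:.0%}, LAS = {LAS:.0%}")
```

Here four of the five heads are correct (UAS 80%), but only three words have both the correct head and the correct label (LAS 60%).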
, and tend to be incomplete