Recurrent Neural Network Language Model

1. What is a language model?

A statistical language model is a probability distribution over sequences of words. Given such a sequence, say of length m, it assigns a probability to the whole sequence. The language model provides context to distinguish between words and phrases that sound similar.
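By the chain rule, the probability of a sequence factors into per-word conditional probabilities. A toy sketch (the conditional probabilities below are invented for illustration):

```python
# Hypothetical conditional probabilities P(word | history), keyed by history.
cond_prob = {
    ("<s>",): {"the": 0.5, "a": 0.5},
    ("<s>", "the"): {"cat": 0.4, "dog": 0.6},
    ("<s>", "the", "cat"): {"sat": 0.7, "ran": 0.3},
}

def sequence_probability(words):
    """P(w1..wm) = P(w1) * P(w2|w1) * ... * P(wm|w1..wm-1)."""
    prob, history = 1.0, ("<s>",)
    for w in words:
        prob *= cond_prob[history][w]
        history = history + (w,)
    return prob

print(sequence_probability(["the", "cat", "sat"]))  # 0.5 * 0.4 * 0.7 ≈ 0.14
```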

2. What kinds of language models are there?

unigram: each word's probability of occurrence depends only on the word itself and is fixed.
n-gram: each word's probability of occurrence depends on the previous n-1 words (its context).
[Figure 1]
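The n-gram idea can be sketched with a bigram (n = 2) model estimated by maximum likelihood counting over a toy corpus:

```python
from collections import Counter, defaultdict

# P(w | prev) = count(prev, w) / count(prev), estimated from a tiny corpus.
corpus = "the cat sat on the mat the cat ran".split()

bigram_counts = defaultdict(Counter)
for prev, w in zip(corpus, corpus[1:]):
    bigram_counts[prev][w] += 1

def bigram_prob(prev, w):
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][w] / total

print(bigram_prob("the", "cat"))  # 2 of the 3 occurrences of "the" are followed by "cat"
```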
Bidirectional: bidirectional representations condition on both pre- and post-context (e.g., surrounding words) in all layers.

Exponential: Maximum entropy language models encode the relationship between a word and the n-gram history using feature functions.
[Figure 2]
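A maximum-entropy model of this form can be sketched with a couple of binary feature functions; every feature name and weight below is invented for illustration:

```python
import math

# P(w | history) is proportional to exp(sum_i lambda_i * f_i(w, history)).
vocab = ["cat", "dog", "mat"]
weights = {"prev_the_and_word_cat": 1.2, "word_is_dog": 0.3}  # hypothetical lambdas

def features(word, history):
    # Hypothetical binary feature functions over the (word, history) pair.
    return {
        "prev_the_and_word_cat": 1.0 if history and history[-1] == "the" and word == "cat" else 0.0,
        "word_is_dog": 1.0 if word == "dog" else 0.0,
    }

def maxent_prob(word, history):
    def score(w):
        f = features(w, history)
        return math.exp(sum(weights[k] * f[k] for k in weights))
    return score(word) / sum(score(w) for w in vocab)
```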
Neural network language models (NNLM, e.g. word2vec): neural language models (or continuous space language models) use continuous representations, or embeddings, of words to make their predictions. These models make use of neural networks.
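A minimal feed-forward sketch of that idea, in the spirit of a Bengio-style NNLM (all sizes and random weights are placeholders, not code from the article): the previous context words are embedded, concatenated, passed through a hidden layer, and a softmax over the vocabulary gives the next-word distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim, hidden_dim, context = 10, 4, 8, 2  # illustrative sizes

E = rng.normal(size=(vocab_size, emb_dim))            # word embeddings
W = rng.normal(size=(context * emb_dim, hidden_dim))  # input -> hidden
V = rng.normal(size=(hidden_dim, vocab_size))         # hidden -> vocab scores

def predict_next(context_ids):
    x = E[context_ids].reshape(-1)       # concatenate the context embeddings
    h = np.tanh(x @ W)                   # hidden layer
    scores = h @ V
    exp = np.exp(scores - scores.max())  # numerically stable softmax
    return exp / exp.sum()

p = predict_next([3, 7])                 # distribution over the next word
```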

3. Recurrent neural network language model (RNN LM)

[Figure 3]
ŷ is the predicted probability distribution, and y is the target (a one-hot vector with a single 1 and 0s elsewhere).

We want to maximize the product of the probabilities of all words, which is equivalent to minimizing the sum over words of the cross-entropy between ŷ and y.

a is the hidden state, which incorporates all the context information preceding the current word.
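Putting the pieces together, one step of such an RNN LM and its cross-entropy loss can be sketched as follows (weights and sizes are illustrative placeholders; notation follows the article: a is the hidden state, ŷ the predicted distribution):

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, hidden_dim = 10, 16          # illustrative sizes

E  = rng.normal(size=(vocab_size, hidden_dim))   # input embeddings
Wa = rng.normal(size=(hidden_dim, hidden_dim))   # recurrent weights
Wy = rng.normal(size=(hidden_dim, vocab_size))   # output weights

def rnn_step(a_prev, word_id):
    """Update the hidden state with the current word, then predict the next one."""
    a = np.tanh(a_prev @ Wa + E[word_id])        # a_t carries all previous context
    scores = a @ Wy
    exp = np.exp(scores - scores.max())
    y_hat = exp / exp.sum()                      # softmax over the vocabulary
    return a, y_hat

a = np.zeros(hidden_dim)
loss = 0.0
for word_id, target_id in [(2, 5), (5, 1)]:      # (current word, next word) pairs
    a, y_hat = rnn_step(a, word_id)
    loss += -np.log(y_hat[target_id])            # per-word cross-entropy
```

Minimizing this summed cross-entropy is the same objective as maximizing the product of the per-word probabilities.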

reference:

[1] https://www.bilibili.com/video/BV18b411p7KT
[2] Language model, Wikipedia: https://en.wikipedia.org/wiki/Language_model
