Andrew Ng: Sequence Models

1.1 Why sequence models

x: Harry Potter and Hermione Granger invented a new spell.

y: 1 1 0 1 1 0 0 0 0

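1.2 Notation

1.3 Recurrent Neural Networks (RNN)

A limitation of the (unidirectional) RNN is that its prediction at a given time step uses only the inputs that came earlier in the sequence, not the ones that come later.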
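
A minimal sketch of this limitation, assuming a scalar RNN with made-up weights (`w_aa`, `w_ax` and `b_a` are illustrative values, not taken from the course). Two input sequences that agree on their first two steps produce identical activations for those steps, whatever comes later.

```python
import math

def rnn_forward(xs, w_ax=0.5, w_aa=0.8, b_a=0.0):
    """Scalar RNN: a<t> = tanh(w_aa * a<t-1> + w_ax * x<t> + b_a)."""
    a = 0.0  # a<0>, the initial activation
    activations = []
    for x in xs:
        a = math.tanh(w_aa * a + w_ax * x + b_a)
        activations.append(a)
    return activations

# Two sequences that share the first two inputs and differ afterwards.
seq_a = [1.0, -1.0, 2.0, 0.5]
seq_b = [1.0, -1.0, -3.0, 4.0]
out_a = rnn_forward(seq_a)
out_b = rnn_forward(seq_b)

print(out_a[:2] == out_b[:2])  # True: later inputs cannot change earlier activations
print(out_a[2] == out_b[2])    # False: the third input differs
```

A bidirectional RNN removes this restriction by also running a second pass from the end of the sequence.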

1.4 Backpropagation through time

1.5 Different types of RNNs

many-to-many: named entity recognition, machine translation

many-to-one: movie rating

one-to-one: a standard neural network

one-to-many: music generation

1.6 Language model and sequence generation

1.7 Sampling novel sequences

sampling a sequence from a trained RNN
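
A rough sketch of the sampling loop, under a strong simplifying assumption: the trained RNN is replaced by a hypothetical lookup table `TOY_LM` that depends only on the previous token, whereas a real RNN conditions on the whole history through its hidden state. The loop itself has the same shape: sample a token from the predicted distribution, feed it back as the next input, and stop at `<EOS>`.

```python
import random

# Hypothetical stand-in for a trained model: P(next token | previous token).
TOY_LM = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.4, "dog": 0.6},
    "cat": {"sleeps": 0.6, "<EOS>": 0.4},
    "dog": {"sleeps": 0.5, "<EOS>": 0.5},
    "sleeps": {"<EOS>": 1.0},
}

def sample_sequence(lm, rng, max_len=10):
    """Sample tokens one at a time, feeding each sample back as the next input."""
    token, words = "<s>", []
    for _ in range(max_len):
        dist = lm[token]
        # Draw the next token according to the predicted probabilities.
        token = rng.choices(list(dist), weights=list(dist.values()))[0]
        if token == "<EOS>":
            break
        words.append(token)
    return words

rng = random.Random(0)  # fixed seed so the run is repeatable
words = sample_sequence(TOY_LM, rng)
print(" ".join(words))
```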

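1.8 Vanishing gradients with RNNs

Gradient clipping: look at the gradient vector, and if it exceeds some threshold, rescale it so that it does not get too large. This is a remedy for exploding gradients rather than vanishing ones.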
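
One way to sketch this is clipping by norm, shown here on a plain Python list. The function name `clip_by_norm` and the threshold value are illustrative assumptions. Another common variant clips each component to a fixed range instead of rescaling the whole vector.

```python
import math

def clip_by_norm(grad, threshold):
    """Rescale grad so that its L2 norm is at most threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)  # small enough: leave unchanged
    scale = threshold / norm
    return [g * scale for g in grad]

clipped = clip_by_norm([3.0, 4.0], threshold=1.0)   # norm 5 is rescaled to norm 1
untouched = clip_by_norm([0.3, 0.4], threshold=1.0) # norm 0.5 is left as is
print(clipped, untouched)
```

Rescaling keeps the direction of the gradient and only shortens its length.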

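1.9 Gated Recurrent Unit (GRU)

The GRU modifies the hidden layer of the RNN so that it captures long-range connections much better. This effectively mitigates the vanishing gradient problem and lets the network learn much longer-range dependencies.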

GRU (simplified)
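
A scalar sketch of the simplified GRU update as it is usually written: a candidate value c~ = tanh(...), an update gate Gamma_u = sigmoid(...), and c<t> = Gamma_u * c~ + (1 - Gamma_u) * c<t-1>. The parameter names and values below are assumptions chosen to make the gate's effect visible. With the gate almost closed, the memory cell keeps its value across many time steps, which is what helps with long-range dependencies.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_simplified_step(c_prev, x, p):
    """One step of the simplified GRU (update gate only), scalar version."""
    c_tilde = math.tanh(p["w_cc"] * c_prev + p["w_cx"] * x + p["b_c"])  # candidate
    gamma_u = sigmoid(p["w_uc"] * c_prev + p["w_ux"] * x + p["b_u"])    # update gate
    c = gamma_u * c_tilde + (1.0 - gamma_u) * c_prev
    return c, gamma_u

# Illustrative parameters: the large negative bias keeps the update gate near 0.
params = {"w_cc": 0.5, "w_cx": 1.0, "b_c": 0.0,
          "w_uc": 0.0, "w_ux": 0.0, "b_u": -20.0}

c = 0.9
for x in [0.3, -1.2, 2.0] * 20:  # 60 time steps
    c, gamma_u = gru_simplified_step(c, x, params)
print(c)  # stays very close to the initial 0.9
```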

1.10 Long short-term memory (LSTM)

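There is no universal rule for when to use a GRU versus an LSTM. The advantage of the GRU is that it is a simpler model, so it is easier to build a much bigger network with it; having only two gates, it also runs faster and scales to larger models more easily. The LSTM, however, is more powerful and flexible because it has three gates.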
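
To put a rough number on "simpler", the sketch below counts parameters for the textbook formulations, assuming each gate or candidate has one weight matrix over the concatenation [h, x] plus one bias vector. Library implementations may differ slightly, for example by using two bias vectors per gate. The layer sizes are arbitrary.

```python
def gate_block_params(n_hidden, n_input):
    """Parameters of one gate or candidate: weights over [h, x] plus a bias."""
    return n_hidden * (n_hidden + n_input) + n_hidden

def gru_params(n_hidden, n_input):
    # two gates (update, relevance) plus the candidate memory
    return 3 * gate_block_params(n_hidden, n_input)

def lstm_params(n_hidden, n_input):
    # three gates (update, forget, output) plus the candidate memory
    return 4 * gate_block_params(n_hidden, n_input)

print(gru_params(128, 64))   # 74112
print(lstm_params(128, 64))  # 98816
```

Under these assumptions an LSTM layer carries a third more parameters than a GRU layer of the same size.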

1.11 Bidirectional RNN (BRNN)

1.12 Deep RNNs

2.1 Word representation

visualizing word embeddings

2.2 Using word embeddings

2.3 Properties of word embeddings

analogies using word vectors

cosine similarity
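
A small sketch of analogy completion with cosine similarity. The 3-dimensional vectors in `EMBEDDINGS` are invented for illustration (loosely "gender", "royal", "food" features) and are not real learned embeddings. For "a is to b as c is to ?", the code picks the word whose vector is most similar to e_b - e_a + e_c.

```python
import math

def cosine_similarity(u, v):
    """cos(u, v) = u.v / (||u|| * ||v||)"""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, not learned from data.
EMBEDDINGS = {
    "man":   [-1.0, 0.01, 0.0],
    "woman": [1.0, 0.02, 0.0],
    "king":  [-0.95, 0.93, 0.01],
    "queen": [0.97, 0.95, 0.02],
    "apple": [0.0, -0.01, 0.95],
}

def complete_analogy(a, b, c, embeddings):
    """Return the word w that best satisfies: a is to b as c is to w."""
    target = [eb - ea + ec
              for ea, eb, ec in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(embeddings[w], target))

print(complete_analogy("man", "woman", "king", EMBEDDINGS))  # queen
```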

2.4 Embedding matrix

2.5 Learning word embeddings

2.6 Word2Vec

skip-grams

I want a glass of orange juice to go along with my cereal.

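context c ("orange", vocabulary index 6257) -> target t ("juice", vocabulary index 4834)

The sum in the denominator of the softmax p(t | c), taken over every word in the vocabulary, is slow to compute; one solution is to use a hierarchical softmax classifier.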
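
To see where the cost comes from, here is a sketch of the plain softmax p(t | c) = exp(theta_t . e_c) / sum_j exp(theta_j . e_c). The tiny 4-word vocabulary and the values of `theta` and `e_c` are made up for illustration. The denominator has one term per vocabulary word, so each prediction costs time proportional to the vocabulary size.

```python
import math

def softmax_probs(theta, e_c):
    """Return p(t | c) for every target word t, given the context embedding e_c."""
    scores = [sum(t_i * e_i for t_i, e_i in zip(theta_j, e_c)) for theta_j in theta]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    denom = sum(exps)  # one term per vocabulary word: O(|V|) work
    return [e / denom for e in exps]

# Hypothetical parameters: one theta row per vocabulary word.
theta = [[0.2, -0.1], [1.5, 0.3], [-0.7, 0.9], [0.0, 0.0]]
e_c = [1.0, 2.0]
probs = softmax_probs(theta, e_c)
print(probs)
```

With a vocabulary of 10,000 words or more, this sum is repeated for every training example, which is what hierarchical softmax avoids.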

2.7 Negative sampling

2.8 GloVe word vectors (global vectors for word representation)

2.9 Sentiment classification

2.10 Debiasing word embeddings

3.1 Basic models

3.2 Picking the most likely sentence

greedy search vs beam search
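
The difference can be sketched on a hypothetical toy model (`TOY_MODEL` is an invented lookup table standing in for a decoder's P(next token | prefix)). Greedy search is beam search with width 1: it commits to the most likely first token and misses a sequence whose overall probability is higher.

```python
import math

# Hypothetical toy model: P(next token | prefix) as a lookup table.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.5, "y": 0.5},
    ("B",): {"x": 0.9, "y": 0.1},
}

def beam_search(model, length, beam_width):
    """Keep the beam_width most probable prefixes at every step."""
    beams = [((), 0.0)]  # (prefix, log-probability)
    for _ in range(length):
        candidates = []
        for prefix, logp in beams:
            for token, p in model[prefix].items():
                candidates.append((prefix + (token,), logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

greedy_seq, greedy_logp = beam_search(TOY_MODEL, length=2, beam_width=1)
beam_seq, beam_logp = beam_search(TOY_MODEL, length=2, beam_width=2)
print(greedy_seq, math.exp(greedy_logp))  # starts with "A", probability about 0.30
print(beam_seq, math.exp(beam_logp))      # ("B", "x"), probability about 0.36
```

Log-probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on long sequences.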

3.3 Beam search

3.4 Refinements to beam search

3.5 Error analysis in beam search

3.6 Bleu score (bilingual evaluation understudy)

Bleu score on n-grams only

Combined Bleu score: combines the n-gram scores p_1 ... p_4 with a brevity penalty (BP) that penalizes translations that are too short.

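In image captioning, several different captions of the same image can be equally good, and in machine translation there can be several equally good translations. The Bleu score provides a way to evaluate such outputs automatically, which helps speed up algorithm development.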
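
A sketch of the n-gram score p_n as the modified precision from the standard Bleu definition: each candidate n-gram is credited at most as many times as it appears in any single reference. The sentences below are illustrative examples, and the combined score with the brevity penalty is deliberately left out.

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram count divided by the total candidate n-gram count."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()  # highest count of each n-gram in any one reference
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
p1 = modified_precision("the the the the the the the".split(), refs, 1)
p2 = modified_precision("the cat the cat on the mat".split(), refs, 2)
print(p1)  # 2 of the 7 unigrams survive clipping
print(p2)  # 4 of the 6 bigrams survive clipping
```

Without clipping, the first candidate would score a perfect unigram precision, since every "the" appears somewhere in a reference.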

3.7 Attention model intuition

3.8 Attention model
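
A sketch of the core computation, assuming the usual formulation: the attention weights are a softmax over alignment scores and sum to 1, and the context vector is the weighted sum of the encoder activations. In a real model the scores come from a small learned network; here they are given as hypothetical numbers.

```python
import math

def attention_context(scores, annotations):
    """Softmax the scores into weights, then average the annotations with them."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]  # attention weights, sum to 1
    dim = len(annotations[0])
    context = [sum(a * h[i] for a, h in zip(alphas, annotations))
               for i in range(dim)]
    return alphas, context

# Hypothetical encoder activations for three input positions.
annotations = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
alphas, context = attention_context([0.1, 2.0, -1.0], annotations)
print(alphas)   # the second position receives the largest weight
print(context)  # a weighted mix of the three activation vectors
```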

3.9 Speech recognition

attention model for speech recognition

CTC (connectionist temporal classification) cost for speech recognition
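
The CTC cost relies on a collapsing rule, as CTC is usually described: the network emits one symbol per audio frame, repeated symbols not separated by a blank are merged, and the blanks are then dropped. A minimal sketch of that decoding rule follows. The blank is written `_` here as an assumption, and the space is treated as an ordinary character.

```python
def ctc_collapse(output, blank="_"):
    """Merge repeated symbols not separated by a blank, then drop the blanks."""
    collapsed = []
    prev = None
    for ch in output:
        if ch != prev and ch != blank:
            collapsed.append(ch)
        prev = ch
    return "".join(collapsed)

print(ctc_collapse("ttt_h_eee___ ___qqq__"))  # the q
print(ctc_collapse("aa_a"))                   # aa
```

A blank between two identical symbols is what allows genuinely doubled letters to survive the merge.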