This series only adds personal study notes and supplementary derivations to selected points of the original course. If you spot any mistakes, corrections and feedback are welcome. After working through Andrew Ng's course, I organized the material into written form to make review and lookup easier. Since I have been studying English, this series is primarily in English, and I suggest readers also rely mainly on the English, with Chinese as support, so that it lays the groundwork for reading academic papers in related fields later on. - ZJ
Coursera course | deeplearning.ai | 网易云课堂
Please credit the author and source when reposting: ZJ, WeChat public account「SelfImprovementLab」
知乎:https://zhuanlan.zhihu.com/c_147249273
CSDN:http://blog.csdn.net/junjun_zhao/article/details/79012494
4.1 Deep Neural Network
(Subtitle source: 网易云课堂)
Welcome to the fourth week of this course. By now you've seen forward propagation and back propagation in the context of a neural network with a single hidden layer, as well as logistic regression, and you've learned about vectorization and why it's important to initialize the weights randomly. If you've done the past weeks' homework, you've also implemented these ideas and seen some of them work for yourself. So by now you've actually seen most of the ideas you need; what we're going to do this week is take those ideas and put them together, so that you'll be able to implement your own deep neural network. Because this week's programming exercise is longer and involves a bit more work, I'm going to keep the videos for this week shorter, so you can get through them a little more quickly.
That leaves more time at the end for a significant programming exercise, which I hope will leave you having built a deep neural network that you feel proud of. So what is a deep neural network? You've seen this picture for logistic regression, and you've also seen neural networks with a single hidden layer. Here is an example of a neural network with two hidden layers, and one with five hidden layers. We say that logistic regression is a very shallow model, whereas the model here is a much deeper one; shallow versus deep is a matter of degree. A neural network with a single hidden layer is a two-layer neural network: remember, when we count layers in a neural network, we don't count the input layer; we count only the hidden layers and the output layer.
Key point:
Counting the layers of a neural network: count only the hidden layers and the output layer; the input layer is not counted.
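As a minimal illustration of this counting convention (my own sketch, not from the course; the `layer_dims` lists and the specific layer sizes below are made up), each network is described by a list of layer sizes, and the layer count L is the number of entries excluding the input layer:

```python
# A minimal sketch of the layer-counting convention (layer sizes are illustrative).
# layer_dims[0] is the input layer and is NOT counted; L = number of hidden + output layers.

logistic_regression = [3, 1]                  # no hidden layer -> L = 1 ("very shallow")
one_hidden_layer    = [3, 4, 1]               # 1 hidden layer  -> L = 2
two_hidden_layers   = [3, 5, 5, 1]            # 2 hidden layers -> L = 3
five_hidden_layers  = [3, 4, 4, 4, 4, 4, 1]   # 5 hidden layers -> L = 6 ("deep")

for dims in (logistic_regression, one_hidden_layer, two_hidden_layers, five_hidden_layers):
    L = len(dims) - 1  # subtract 1 so the input layer is not counted
    print(f"layer_dims={dims} -> L={L}")
```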
So a two-layer neural network like this is still quite shallow, but not as shallow as logistic regression; technically, logistic regression is a one-layer neural network. Over the last several years, however, the AI and machine learning community has realized that there are functions that very deep neural networks can learn but that shallower models are often unable to. Although for any given problem it can be hard to predict in advance exactly how deep a network you will need, it is reasonable to try logistic regression first, then one hidden layer, then two, and to view the number of hidden layers as another hyperparameter: you can try a variety of values and evaluate them on hold-out cross-validation data, or on your development set. We'll say more about that later as well.
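As a rough sketch of what treating depth as a hyperparameter could look like in practice (my own illustration, not from the course; `build_and_train` and `evaluate` are hypothetical, user-supplied helpers):

```python
# Hypothetical sketch: treat the number of hidden layers as a hyperparameter
# and keep the depth that scores best on a held-out development set.

def search_depth(X_train, Y_train, X_dev, Y_dev, build_and_train, evaluate):
    """build_and_train and evaluate are assumed, user-supplied helper functions."""
    best_depth, best_score = None, float("-inf")
    for num_hidden_layers in [0, 1, 2, 3, 4, 5]:   # 0 hidden layers ~ logistic regression
        model = build_and_train(X_train, Y_train, num_hidden_layers)
        score = evaluate(model, X_dev, Y_dev)       # e.g. dev-set accuracy
        if score > best_score:
            best_depth, best_score = num_hidden_layers, score
    return best_depth, best_score
```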
Let's now go through the notation we use to describe deep neural networks. Here is a one, two, three, four: a four-layer neural network with three hidden layers, where the numbers of units in the hidden layers are 5, 5, and 3, followed by one output unit. The notation we're going to use is a capital L to denote the number of layers in the network, so in this case L = 4; that's the number of layers. We'll use n superscript [l] to denote the number of nodes, or units, in layer lowercase l. If we index the input as layer 0, then this is layer 1, this is layer 2, this is layer 3, and this is layer 4. So, for example, n[1], the number of units in the first hidden layer, is equal to 5, because we have 5 hidden units there.
Similarly, n[2], the number of units in the second hidden layer, is also equal to 5; n[3] = 3; and n[4], which is n[L], the number of output units, is equal to 1, because here our capital L is equal to 4. For the input layer we also have n[0] = n_x = 3. So that's the notation we'll use to describe the number of nodes in the different layers. For each layer l, we'll also use a[l] to denote the activations in layer l; we'll see later that in forward propagation you end up computing a[l] as the activation function g applied to z[l], where the activation function is indexed by the layer l as well. We'll then use W[l] to denote the weights for computing the value z[l] in layer l, and similarly b[l] is used to compute z[l].
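To make the notation concrete, here is a small sketch for the four-layer example above (my own illustration; the shape convention W[l] of shape (n[l], n[l-1]) and b[l] of shape (n[l], 1) is the usual one for this layout, although it isn't stated in this particular video):

```python
import numpy as np

# The four-layer example from the video: n[0] = 3 inputs, hidden layers of 5, 5, 3 units,
# and one output unit. layer_dims[l] plays the role of n[l].
layer_dims = [3, 5, 5, 3, 1]
L = len(layer_dims) - 1          # L = 4; the input layer (layer 0) is not counted

# Sketch of parameter shapes, assuming W[l] has shape (n[l], n[l-1]) and b[l] has shape (n[l], 1).
parameters = {}
for l in range(1, L + 1):
    parameters[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))
    print(f"W{l}: {parameters[f'W{l}'].shape}, b{l}: {parameters[f'b{l}'].shape}")
```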
Key points:
L = 4 (the number of layers)
n[l] = the number of units in layer l
Hidden layers: n[1] = 5, n[2] = 5, n[3] = 3
Output layer: n[4] = n[L] = 1
Input layer: n[0] = n_x = 3
a[l] = the activations in layer l
a[l] = g[l](z[l])
W[l] = the weights for computing z[l]
b[l] is used to compute z[l]
x = a[0]: the input features x are also the activations of layer 0
ŷ = a[L]: the predicted output is the activation of the final layer L
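Putting these symbols together, here is a minimal sketch of how they combine in forward propagation (my own illustration, anticipating the next video: the rule z[l] = W[l] a[l-1] + b[l] is derived there, and sigmoid is used for every g[l] purely to keep the example self-contained). It can be run with the hypothetical `parameters` dictionary built in the earlier sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(x, parameters, L):
    """Minimal sketch of forward propagation with the notation above.

    Assumes parameters["Wl"] has shape (n[l], n[l-1]) and parameters["bl"]
    has shape (n[l], 1); sigmoid stands in for every g[l] just for illustration.
    """
    a = x                                                   # a[0] = x, the input features
    for l in range(1, L + 1):
        z = parameters[f"W{l}"] @ a + parameters[f"b{l}"]   # z[l]
        a = sigmoid(z)                                      # a[l] = g[l](z[l])
    return a                                                # a[L] = y-hat, the prediction
```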
Finally, to wrap up the notation: the input features are called x, but x is also the activations of layer 0, so a[0] = x, and the activation of the final layer, a[L], is equal to ŷ. That is, a superscript [L] equals the predicted output ŷ of the neural network. So you now know what a deep neural network looks like, as well as the notation we'll use to describe and compute with deep networks. I know I've introduced a lot of notation in this video, but if you ever forget what some symbol means, we've also posted a notation sheet, or notation guide, on the course website that you can use to look up what the different symbols mean. Next, I'd like to describe what forward propagation looks like in this type of network; let's go on to the next video.
PS: You're welcome to follow the public account「SelfImprovementLab」! It focuses on deep learning, machine learning, and artificial intelligence, and occasionally organizes group check-in activities around early rising, reading, exercise, English, and other topics.