What is a Neural Network?
This series only adds personal study notes and supplementary derivations to the original course material; corrections and feedback are welcome. Having worked through Andrew Ng's course, I organized it into text to make it easier to review. Since I have been studying English, the series is mainly in English, and I also suggest that readers rely mainly on the English, with Chinese as a supplement, as preparation for reading academic papers in related fields later on. - ZJ
Coursera course | deeplearning.ai | 网易云课堂
When reposting, please credit the author and source: WeChat official account 「SelfImprovementLab」
Zhihu: https://zhuanlan.zhihu.com/c_147249273
CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78747702
Jianshu: http://www.jianshu.com/p/b82afdc49f25
Juejin: https://juejin.im/post/5a29dacc6fb9a044ff315de6
GitHub: https://github.com/laobadao
1.2 What is a Neural Network?
The term deep learning refers to training neural networks, sometimes very large neural networks. So what exactly is a neural network? In this video, let's try to give you some of the basic intuitions. Let's start with a housing price prediction example.
Let's say you have a data set with six houses. You know the size of each house in square feet or square meters, you know its price, and you want to fit a function that predicts the price of a house as a function of its size. So if you are familiar with linear regression, you might say, well, let's fit a straight line to this data.
So maybe you get a straight line like that. But to be a bit fancier, you might say, well, we know that prices can never be negative. So instead of a straight-line fit, which would eventually become negative, let's bend the curve here so that it just ends up at zero. This thick blue line ends up being your function for predicting the price of a house as a function of its size: it is zero here, and then there is a straight-line fit to the right.
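As a sketch of the shape just described (the slope $w$ and intercept $b$ stand in for whatever the straight-line fit gives; they are placeholders, not values from the lecture), the bent curve can be written as

$$\text{price}(x) = \max(0,\ wx + b).$$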
You can think of this function that you just fit to housing prices as a very simple neural network, almost the simplest possible neural network. Let me draw it here. The input to the neural network is the size of the house, which we will call x. It goes into this node, this little circle, and it outputs the price, which we call y. So this little circle is a single neuron, and the network implements the function that we drew on the left. All the neuron does is take the size as input, compute this linear function, take the max with zero, and output the estimated price.
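Here is a minimal Python sketch of that single neuron; the weight w and bias b are made-up illustrative numbers, not parameters learned from any data.

```python
def relu(z):
    """Rectified linear unit: return z if it is positive, otherwise 0."""
    return max(0.0, z)


def predict_price(size, w=100.0, b=-5000.0):
    """Single-neuron predictor: a linear function of size, clamped at zero.

    w and b are placeholder values for illustration only.
    """
    return relu(w * size + b)


# Example: estimated price for a 1500-square-foot house.
print(predict_price(1500))  # 100.0 * 1500 - 5000.0 = 145000.0
```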
In the neural network literature, you see this function a lot. This function, which is zero for a while and then takes off as a straight line, is called a ReLU function, which stands for rectified linear unit. "Rectify" just means taking the max with zero, which is why you get a function shaped like this.
You don't need to worry about ReLU units for now; it's just something you will see again later in this course. So if this is a single neuron, maybe a tiny little neural network, then a larger neural network is formed by taking many of these single neurons and stacking them together. If you think of this neuron as a single Lego brick, you get a bigger neural network by stacking together many of these Lego bricks.
Let's see an example. Instead of predicting the price of a house just from its size, suppose you now have other features; you know other things about the house, such as the number of bedrooms. You might think that one of the things that really affects the price of a house is family size: can this house fit a family of three, or four, or five? And it is really the size in square feet or square meters, together with the number of bedrooms, that determines whether or not a house can fit your family.
And then maybe you know the zip code, which in other countries is called the postal code. The zip code might serve as a feature that tells you walkability: is this neighborhood highly walkable? Can you walk to the grocery store and to school, or do you need to drive? Some people prefer highly walkable neighborhoods. And the zip code, as well as the wealth of the neighborhood, tells you (certainly in the United States, but in some other countries as well) how good the school quality is. So each of these little circles I'm drawing can be one of those ReLUs (rectified linear units) or some other slightly nonlinear function, so that based on the size and the number of bedrooms you can estimate the family size, from the zip code you can estimate walkability, and based on the zip code and wealth you can estimate the school quality.
And then, finally, you might think that the way people decide how much they are willing to pay for a house is by looking at the things that really matter to them; in this case, family size, walkability, and school quality, and that is what helps you predict the price. So in this example, x is all four of these inputs and y is the price you are trying to predict.
So by stacking together a few of the single neurons, the simple predictors we had on the previous slide, we now have a slightly larger neural network. Part of the magic of a neural network is that when you implement it, you only need to give it the input x and the output y for a number of examples in your training set, and all of these things in the middle it will figure out by itself. So what you actually implement is this: a neural network with four inputs, where the input features might be the size, the number of bedrooms, the zip code (or postal code), and the wealth of the neighborhood. Given these input features, the job of the neural network is to predict the price y. Notice also that each of these circles, which are called hidden units in a neural network, takes all four input features as its input.
So, for example, rather than saying that this first node represents family size and that family size depends only on the features x1 and x2, we instead say: neural network, you decide whatever you want this node to be, and we'll give you all four input features to compute whatever you want. So we say that this input layer and this layer in the middle of the neural network are densely connected, because every input feature is connected to every one of these circles in the middle.
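As a rough illustration of the forward pass of such a network, here is a minimal NumPy sketch with four inputs, one densely connected hidden layer, and one output. The choice of three hidden units and the randomly initialized (untrained) weights are assumptions made just to show the shape of the computation, not anything specified in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four input features: size, number of bedrooms, zip code, neighborhood wealth.
x = np.array([1500.0, 3.0, 94301.0, 7.5])

# Densely connected hidden layer: every input feature feeds every hidden unit.
W1 = rng.normal(size=(3, 4))  # untrained, randomly initialized weights
b1 = np.zeros(3)

# Output layer: combines the hidden units into a single price estimate.
W2 = rng.normal(size=(1, 3))
b2 = np.zeros(1)


def relu(z):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(0.0, z)


# Forward pass: the network itself decides what each hidden unit represents.
hidden = relu(W1 @ x + b1)
y_hat = relu(W2 @ hidden + b2)
print(y_hat)
```

Training would then adjust W1, b1, W2, and b2 so that the predicted price matches the prices in the training set; that middle part is what the network figures out by itself.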
And the remarkable thing about neural networks is that, given enough data about x and y, given enough training examples with both x and y, neural networks are remarkably good at figuring out functions that accurately map from x to y. So that's a basic neural network. It turns out that as you build out your own neural networks, you will probably find them to be most useful and most powerful in supervised learning settings, meaning that you are trying to take an input x and map it to some output y, like we just saw in the housing price prediction example. In the next video, let's go over some more examples of supervised learning, and some examples where you might find neural networks to be incredibly helpful for your applications as well.
PS: You are welcome to follow the WeChat official account 「SelfImprovementLab」, which focuses on deep learning, machine learning, and artificial intelligence. Let's get up early, learn, and improve together.