Introduction of Deep Learning

1. three steps for deep learning

  1. define a set of functions (a neural network)
  2. evaluate the goodness of a function
  3. pick the best function

2. neural network

2.1 neuron

(figure: a single neuron)

A neuron computes a weighted sum of its inputs plus a bias, then passes the result through an activation function: a = σ(Σ_i w_i x_i + b).

2.2 sigmoid function

(figure: the sigmoid function, σ(z) = 1/(1 + e^(−z)))

The range of the sigmoid function is (0, 1). It is commonly used as the activation function of neurons in a neural network, mapping any real-valued input into the interval (0, 1).
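
As a minimal sketch (not from the original notes; the variable names are illustrative), a single neuron with a sigmoid activation can be written in a few lines of NumPy:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single neuron: weighted sum of the inputs plus a bias,
    passed through the sigmoid activation."""
    return sigmoid(np.dot(w, x) + b)

x = np.array([1.0, -2.0])   # inputs
w = np.array([0.5, 0.5])    # weights
b = 0.1                     # bias
print(neuron(x, w, b))      # a value strictly between 0 and 1
```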

2.3 neural network

  1. Different connections lead to different network structures.
  2. Each neuron can have different values of weights and biases. The weights and biases together are the network parameters θ.

2.4 fully connected feedforward network

(figure: a fully connected feedforward network with an input layer, several hidden layers, and an output layer)

Deep means many hidden layers.
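
A hedged sketch of the forward pass of such a network (the layer sizes and parameter values below are arbitrary, chosen only for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, params):
    """Fully connected feedforward pass: each layer computes
    sigmoid(W @ a + b). The list of (W, b) pairs is the parameter set θ."""
    a = x
    for W, b in params:
        a = sigmoid(W @ a + b)
    return a

# An illustrative 2-3-2 network with random parameters θ.
rng = np.random.default_rng(0)
params = [
    (rng.standard_normal((3, 2)), rng.standard_normal(3)),  # hidden layer
    (rng.standard_normal((2, 3)), rng.standard_normal(2)),  # output layer
]
print(feedforward(np.array([1.0, -1.0]), params))
```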

2.5 output layer (optional)

ordinary layer: each output neuron simply applies the activation function to its input, y_i = σ(z_i).

softmax layer: the outputs are exponentiated and then normalized, y_i = e^(z_i) / Σ_j e^(z_j).

In general, the output of the network can be any value, which may not be easy to interpret.
If the output layer uses softmax, the outputs can be interpreted as probabilities (see the sketch after this list), because they satisfy:
  • 0 < y_i < 1
  • Σ_i y_i = 1
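
A small numeric sketch of softmax (the max-subtraction is a standard numerical-stability trick, not something the notes mention):

```python
import numpy as np

def softmax(z):
    """Softmax: exponentiate each input, then normalize to sum to 1."""
    e = np.exp(z - np.max(z))  # subtracting the max avoids overflow
    return e / e.sum()

y = softmax(np.array([3.0, 1.0, -2.0]))
print(y)        # every y_i lies in (0, 1)
print(y.sum())  # 1.0
```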

2.6 FAQ

  • How many layers? How many neurons for each layer?
    Trial and error, plus intuition.
  • Can the structure be determined automatically?

3. goodness of function

3.1 training data

  • input data
  • their labels

3.2 learning target

Given the input data, output the corresponding label.

3.3 loss

Loss can be the distance between the network output and the target.
A good function should make the loss over all training examples as small as possible.
total loss: L = Σ_i l_i, where l_i is the loss on the i-th training example.
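
As a sketch of the total loss L = Σ_i l_i, using squared distance as the per-example loss l_i (the notes only say "distance"; cross-entropy is the more common choice with softmax outputs):

```python
import numpy as np

def example_loss(y, t):
    """Per-example loss l_i: squared distance between output y and target t."""
    return np.sum((y - t) ** 2)

def total_loss(outputs, targets):
    """Total loss L: the sum of the per-example losses l_i."""
    return sum(example_loss(y, t) for y, t in zip(outputs, targets))

outputs = [np.array([0.8, 0.2]), np.array([0.3, 0.7])]  # network outputs
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # their labels
print(total_loss(outputs, targets))
```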

4. pick the best function

How to pick the best function (find network parameters θ that minimize total loss L)?

  1. Enumerate all possible values (infeasible in practice)
  2. Gradient Descent

4.1 Gradient Descent

(figure: gradient descent along the loss curve)
  1. Pick an initial value for w (randomly, or with RBM pre-training).
  2. Compute ∂L/∂w: if it is negative, increase w; if it is positive, decrease w.
    The update is w ← w − η ∂L/∂w, where η is called the "learning rate".
  3. Repeat step 2 until ∂L/∂w is approximately zero, i.e., until the update becomes very small (see the sketch below).
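
A minimal one-parameter sketch of this loop (the toy loss L(w) = (w − 3)², with ∂L/∂w = 2(w − 3), is purely illustrative):

```python
def gradient_descent(grad_L, w0, eta=0.1, tol=1e-6, max_steps=10000):
    """Minimize L by repeatedly stepping against its gradient.

    grad_L: a function returning dL/dw at w
    eta:    the learning rate
    Stops once the gradient (and hence the update) is very small.
    """
    w = w0
    for _ in range(max_steps):
        g = grad_L(w)
        if abs(g) < tol:    # gradient ~ 0: stop
            break
        w = w - eta * g     # w <- w - eta * dL/dw
    return w

# Toy loss L(w) = (w - 3)^2; the minimum is at w = 3.
print(gradient_descent(lambda w: 2 * (w - 3), w0=0.0))
```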

4.2 avoid local minima

Gradient descent never guarantees reaching the global minimum. There are some tips that can help you escape local minima, but still with no guarantee (see the sketch below).
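
The notes do not name the tips; one commonly used example (an assumption on my part, not from the notes) is momentum, where the update accumulates a velocity that can carry w past shallow local minima:

```python
def gradient_descent_momentum(grad_L, w0, eta=0.1, beta=0.9, steps=1000):
    """Gradient descent with momentum (illustrative): the velocity v keeps a
    decayed memory of past updates, which can roll w through shallow local
    minima, still with no guarantee of reaching the global minimum."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = beta * v - eta * grad_L(w)  # new velocity: decayed old + new step
        w = w + v
    return w

print(gradient_descent_momentum(lambda w: 2 * (w - 3), w0=0.0))  # ~3.0
```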

5. Why Deep?

  1. Deep is better: more parameters, better performance. (That is, better performance can be achieved by going deep.)
  2. Universality Theorem: any continuous function f : R^N → R^M can be realized by a network with one hidden layer, given enough hidden neurons. (That is, the same functions could also be realized by going fat.)

  3. In terms of error rate, thin + tall is better than fat + short.
  4. Going deep enables modularization, which requires less training data. For example, to classify a group of animals: classifying directly by species requires a certain amount of training data for each of the many species, whereas classifying layer by layer in the order class → order → family → genus → species needs less training data.
