

Biological Neuron (neurone)

  • A neuron, also known as a nerve cell, communicates with other cells via
    specialized connections called synapses.
  • A neuron consists of a cell body (soma), dendrites, and a single axon.
  • Most neurons receive signals via the dendrites and soma, and send signals out along the axon.
  • At the majority of synapses, signals cross from the axon of one neuron to a dendrite of another.
  • Synapses can also connect an axon to another axon, or a dendrite to another dendrite.

Working States of Biological Neuron

Neurons have two normal working states:

  • Excited state: when an afferent nerve impulse raises the cell membrane potential above the action potential threshold, the cell enters the excited state, produces a nerve impulse, and outputs it along the axon.
  • Inhibitory state: when an afferent nerve impulse lowers the cell membrane potential below the action potential threshold, the cell enters the inhibitory state and outputs no nerve impulse.
  • Learning and forgetting: synaptic transmission can be enhanced and weakened due to the plasticity of neuronal structures.

Mathematical Model for Neuron

  • In 1943, McCulloch and Pitts proposed the M-P model.

McCulloch: neurologist and anatomist
Pitts: mathematician

  • The output activation of unit j is a_j, where a_i is the output activation of unit i and w_{i,j} is the weight on the link from unit i to unit j.
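
The relationship between a_i, w_{i,j}, and a_j can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual weighted-sum form a_j = g(sum_i w_{i,j} * a_i), where g is an activation function. The function names `step` and `unit_output`, the threshold of 0.5, and the numeric values are hypothetical choices for the example, not part of the original notes.

```python
def step(x, threshold=0.0):
    """Threshold activation: 1 (excited) if x exceeds the threshold, else 0 (inhibited)."""
    return 1 if x > threshold else 0


def unit_output(inputs, weights, g=step):
    """Compute a_j = g(sum_i w_ij * a_i) for a single unit j (assumed form)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    return g(weighted_sum)


# Illustrative values: two input activations a_1, a_2 and their weights.
a = [1.0, 0.0]
w = [0.6, 0.6]

# Use a step activation with an assumed threshold of 0.5.
a_j = unit_output(a, w, g=lambda s: step(s, threshold=0.5))
print(a_j)  # weighted sum is 0.6, which exceeds 0.5, so the unit fires
```

With both inputs at 0 the weighted sum is 0, which stays below the threshold, so the same unit outputs 0.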

Artificial Neuron


  • Artificial neurons loosely model the neurons in a biological brain. An ANN is a system built from a collection of interconnected units or nodes called artificial neurons.
  • An ANN tries to simulate the learning process of the human brain. Each connection, like a synapse in a biological brain, can transmit a signal from one artificial neuron to another.
  • An artificial neuron that receives a signal can process it and then signal additional artificial neurons connected to it.
  • In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs.
  • The connections between artificial neurons are called edges. Each edge has a weight that can be adjusted as learning proceeds. The weight increases or decreases the strength of the signal at a connection.
  • Artificial neurons may have a threshold such that the signal is only sent if the aggregate signal crosses that threshold (this mimics the excited and inhibitory states).

Artificial Neural Networks

  • Typically, artificial neurons are aggregated into layers. Neurons in different layers are connected to each other to form an artificial neural network.
  • Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.
  • The original goal of the ANN approach was to solve problems in the same way that a human brain would.
  • However, over time, attention moved to performing specific tasks, leading to deviations from biology.
  • ANNs have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games, and medical diagnosis.
  • An ANN is thus not an algorithm in itself, but rather a framework in which many different machine learning algorithms work together to process complex data inputs.


  • The perceptron, proposed by Frank Rosenblatt in 1957, is a binary linear classifier.
  • A network with all the inputs connected directly to the outputs is called a single-layer neural network, or a perceptron.
  • A perceptron has only one set of input units (the input layer) and one output unit (the output layer).
    - Disadvantage: a perceptron can only solve linearly separable
    classification problems.
  • Remedy: increase the number of layers so that the network can
    solve more complex problems.
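
The limitation noted above can be checked directly on two small truth tables. The sketch below is illustrative only: the function names, the hand-picked AND parameters, and the search grid are assumptions for the example. It shows that a single perceptron reproduces AND (linearly separable) but that no parameter setting on the grid reproduces XOR (not linearly separable).

```python
from itertools import product


def perceptron(x1, x2, w1, w2, b):
    """Single-layer perceptron: output 1 if w1*x1 + w2*x2 + b > 0, else 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0


def solves(truth_table, w1, w2, b):
    """True if the perceptron with these parameters reproduces the whole truth table."""
    return all(
        perceptron(x1, x2, w1, w2, b) == y
        for (x1, x2), y in truth_table.items()
    )


AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Hand-picked parameters (an assumption for this example) that separate AND.
and_ok = solves(AND, 1.0, 1.0, -1.5)

# Coarse grid over w1, w2, b in [-2, 2] with step 0.5.
# XOR is not linearly separable, so no setting reproduces it.
grid = [i / 2 for i in range(-4, 5)]
xor_solutions = [p for p in product(grid, repeat=3) if solves(XOR, *p)]

print(and_ok)              # True
print(len(xor_solutions))  # 0
```

The empty result is not an artifact of the grid: the four XOR constraints contradict each other for any real-valued w1, w2, and b.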

1985: Multilayer Neural Network (MNN)


  • In addition to the input layer and the output layer, introduce the intermediate layer, called the hidden layer. There can be multiple hidden layers.
  • The output of each unit in one layer is an input to the units in the next layer.
  • The hidden layer, as its name suggests, does not interact directly with the external environment. The number of hidden layers can range from zero to several.
    - When counting the layers of an NN, the input layer is excluded; only the hidden layers and the output layer are counted. This count is called the depth of the NN.
  • When the number of layers of an NN reaches a certain number, it is called a deep neural network (DNN).
  • An MNN can solve problems that are not linearly separable.
  • An NN with multiple output units can solve multi-class classification problems.
  • Suppose each sample has n features and there are m categories C_1, …, C_m. When constructing the neural network, n input units and m output units need to be designed.
  • When the class of a sample is C_i, the desired output value of the ith output unit is 1, and the desired output value of every other output unit is 0.
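
Two points from the list above can be illustrated with a short sketch. First, a network with one hidden layer of two threshold units computes XOR, which a single perceptron cannot. Second, the desired output vector for class C_i has a 1 in position i and 0 elsewhere. The weights and the names `xor_network` and `desired_output` are hand-picked assumptions for this example, not values from the original notes.

```python
def step(x):
    """Threshold activation: 1 if x > 0, else 0."""
    return 1 if x > 0 else 0


def xor_network(x1, x2):
    """Two-layer network with hand-set weights (illustrative, not learned)."""
    # Hidden layer: one unit behaves like OR, the other like AND.
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    # Output layer: fires when OR is on and AND is off.
    return step(h_or - h_and - 0.5)


def desired_output(i, m):
    """Target vector for class C_i (1-based) among m classes: 1 at position i, 0 elsewhere."""
    if not 1 <= i <= m:
        raise ValueError("class index i must satisfy 1 <= i <= m")
    return [1 if k == i else 0 for k in range(1, m + 1)]


outputs = [xor_network(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)               # [0, 1, 1, 0]
print(desired_output(2, 3))  # [0, 1, 0]
```

Counting layers as described above, this network has depth 2: one hidden layer plus the output layer.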

Shallow vs. Deep Neural Network

  • There is no universally agreed-upon depth threshold that divides shallow neural networks from deep neural networks.
  • However, most researchers agree that a deep neural network has more than 2 hidden layers, and that a network with more than 10 hidden layers counts as a very deep neural network.

Activation Function

  • The activation function is also called the transfer function or output transformation function.
Why is an activation function needed?
  • The activation function of the early artificial neuron was used to simulate the action potential threshold in biology. If the cell membrane potential exceeds the threshold, the cell is in the excited state and outputs a signal; otherwise the cell is in the inhibitory state and outputs no signal. A step function was therefore adopted.
  • Later, when the single-layer NN was extended to the multi-layer NN, it was found that, without a nonlinear activation, the output of each layer is a linear function of the previous layer's output. No matter how many layers the NN has, its output is then a linear combination of the inputs, and a linear model cannot solve nonlinear problems.
  • Therefore, a nonlinear activation function is introduced into the neuron to turn the linear mapping into a nonlinear one. With it, an NN can approximate any nonlinear function and can be used to solve nonlinear problems.
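
The collapse of stacked linear layers can be verified numerically. The sketch below uses plain Python lists and illustrative weight matrices `W1` and `W2` (assumed values, not from the original notes). It shows that two linear layers equal a single layer with weights W2·W1, while inserting ReLU between them gives a different, nonlinear result.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu(v):
    """Apply ReLU element-wise."""
    return [max(0.0, vi) for vi in v]


# Illustrative weights: a 2-unit hidden layer followed by a 1-unit output layer.
W1 = [[1.0, -1.0],
      [-1.0, 1.0]]
W2 = [[1.0, 2.0]]
x = [2.0, 0.5]

two_linear = matvec(W2, matvec(W1, x))       # layer by layer, no activation
collapsed = matvec(matmul(W2, W1), x)        # one equivalent linear layer
with_relu = matvec(W2, relu(matvec(W1, x)))  # nonlinearity between the layers

print(two_linear, collapsed)  # [-1.5] [-1.5]
print(with_relu)              # [1.5]
```

Because `two_linear` and `collapsed` agree for every input, extra linear layers add no expressive power; only the version with ReLU escapes the single-layer form.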
Five commonly used activation functions

(1) Threshold function (i.e. step or jump function)
(2) Other non-linear functions: ReLU, Logistic-Sigmoid, Tanh-Sigmoid, Softmax
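
The four scalar functions in this list can be written down in a few lines. This is a sketch using the standard textbook definitions, which the notes name but do not spell out; the default threshold of 0 in `step` is an assumption. Softmax acts on a whole vector rather than a single value, so it is left out here.

```python
import math


def step(x, threshold=0.0):
    """Threshold (step) function: 1 above the threshold, 0 otherwise."""
    return 1.0 if x > threshold else 0.0


def relu(x):
    """ReLU: passes positive values through, clips negatives to 0."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)


for x in (-2.0, 0.0, 2.0):
    print(x, step(x), relu(x), round(sigmoid(x), 3), round(tanh(x), 3))
```

The step function outputs only 0 or 1, ReLU is unbounded above, the sigmoid maps into (0, 1), and tanh maps into (-1, 1).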

Logistic-Sigmoid function
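
For reference, the standard definition of the logistic sigmoid is given below, together with its derivative and its relation to tanh. These are the usual textbook forms, and the symbol σ is an assumption, since the notes do not fix a notation.

```latex
\begin{align}
\sigma(x) &= \frac{1}{1 + e^{-x}}, \qquad \sigma(x) \in (0, 1), \qquad \sigma(0) = \tfrac{1}{2} \\
\sigma'(x) &= \sigma(x)\,\bigl(1 - \sigma(x)\bigr) \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\,\sigma(2x) - 1
\end{align}
```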


Softmax Function
  • The Softmax function, or normalized exponential function, makes each
    output value lie in (0, 1) and makes all the output values sum to 1.
  • It is applied to the output layer of a multi-class NN, normalizing the
    outputs into a probability distribution.
  • Its general effect is to normalize the output vector, highlighting the
    maximum value and suppressing the other components that are far
    below the maximum value.
  • For example, the input vector [1, 2, 3, 4, 1, 2, 3] corresponds to the Softmax function value [0.024, 0.064, 0.175, 0.475, 0.024, 0.064, 0.175]. The item with the maximum weight in the output vector corresponds to the maximum value of “4” in the input vector.
  • Sigmoid is a special case of Softmax: when the number of classes is 2, Softmax reduces to Sigmoid applied to the difference of the two inputs.
  • Sigmoid is used for binary classification, and
    Softmax is used for multi-class classification.
  • When the number of classes is 2, a fully connected NN without a
    hidden layer becomes logistic regression.
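
The numbers in the example above, and the two-class relation to Sigmoid, can be reproduced with a short sketch. It assumes the standard definition softmax(z)_k = exp(z_k) / sum_j exp(z_j). Subtracting the maximum before exponentiating is a common numerical-stability trick added here; it does not change the result.

```python
import math


def softmax(z):
    """Normalized exponential: outputs lie in (0, 1) and sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


probs = softmax([1, 2, 3, 4, 1, 2, 3])
print([round(p, 3) for p in probs])
# [0.024, 0.064, 0.175, 0.475, 0.024, 0.064, 0.175]

# Two classes: softmax over [z, 0] gives sigmoid(z) for the first class.
z = 1.5
two_class = softmax([z, 0.0])[0]
print(abs(two_class - sigmoid(z)) < 1e-12)  # True
```

In the two-class case only the difference between the two inputs matters, which is why fixing the second input at 0 loses no generality.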


  • The classes in Softmax are mutually exclusive, that is, an input can be assigned to only one class.
  • Multiple logistic regressions (one per class) can also perform multi-class classification, but the output categories are not mutually exclusive: an “apple” is also a “fruit”.
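
The contrast between the two approaches can be sketched as follows. The labels, the scores, and the 0.5 decision threshold are hypothetical values chosen for this example. Softmax picks exactly one class, while independent sigmoids (one logistic regression per class) can switch on several labels at once.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    """Normalized exponential over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical labels and raw scores for one input.
labels = ["apple", "fruit", "car"]
scores = [2.0, 3.0, -4.0]

# Softmax: classes compete, so only the largest probability wins.
probs_exclusive = softmax(scores)
exclusive = labels[probs_exclusive.index(max(probs_exclusive))]

# Independent sigmoids: each label is judged on its own.
probs_independent = [sigmoid(s) for s in scores]
predicted = [name for name, p in zip(labels, probs_independent) if p > 0.5]

print(exclusive)  # fruit
print(predicted)  # ['apple', 'fruit']
```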

History of Artificial Neural Networks


  • At that time, however, neural network research was in a trough
    (1969-1982), so the BP algorithm did not attract much attention.
  • Interest in the BP algorithm revived only when neural network research entered its second peak in the 1980s (1983-1990).
  • In 1985, the multi-layer ANN appeared, breaking through the limitations of the early perceptron.
  • In 1986, Rumelhart (an American psychologist), Geoffrey Hinton, and colleagues
    independently re-proposed the learning algorithm for the MNN, the BP algorithm.
    (Geoffrey Hinton, Yann LeCun, and Yoshua Bengio are known as the three founding figures of artificial intelligence.)


Year  Who                                 Event
1943  Warren McCulloch & Walter Pitts     Founding work on neural networks
1958  Frank Rosenblatt                    Perceptron model
1959  Hubel & Wiesel                      Visual cortex
1969  Marvin Lee Minsky & Seymour Papert  Pointed out the limitations of the perceptron
1982  John Hopfield                       Hopfield network
1986  Rumelhart & McClelland              Backpropagation
1995  Vapnik                              SVM
2006  Geoffrey Hinton                     Deep learning
2009  Fei-Fei Li                          ImageNet
2012  Andrew Ng                           Google Brain
