The difference arises when we calculate the delta of the output node:
e = d - y;
delta = e;
Unlike the previous example code, the derivative of the sigmoid function is no longer used.
This is because, under the learning rule for the cross entropy function, the delta equals the output error when the activation function of the output node is the sigmoid.
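The cancellation behind this rule can be written out in a few lines. The derivation below is a sketch for a single sigmoid output node; the symbols J (cross entropy cost) and \sigma (sigmoid) are notation assumed here, while d, y, v, e, and delta match the variables in the code.

```latex
\begin{align*}
J &= -d \ln y - (1 - d)\ln(1 - y), \qquad y = \sigma(v), \qquad \sigma'(v) = y(1 - y) \\
\frac{\partial J}{\partial v}
  &= \left( -\frac{d}{y} + \frac{1 - d}{1 - y} \right) y(1 - y)
   = -d(1 - y) + (1 - d)\,y
   = y - d \\
\delta &= -\frac{\partial J}{\partial v} = d - y = e
\end{align*}
```

The factor y(1 - y) from the sigmoid derivative cancels against the denominators of the cross entropy gradient, which is why the code sets delta = e directly.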
Of course, the hidden nodes follow the same process used by the previous back-propagation algorithm:
e1 = W2'*delta;
delta1 = y1.*(1-y1).*e1;
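To show how the two delta calculations fit into one training pass, here is a minimal sketch of what a BackpropCE function could look like. It is assembled from the fragments above and the call signature used in the test script. The learning rate alpha = 0.9, the per-sample (SGD) weight update, and the local Sigmoid helper are assumptions, not a reproduction of the original listing.

```matlab
function [W1, W2] = BackpropCE(W1, W2, X, D)
% Sketch of one epoch of back-propagation with the cross entropy rule.
% W1: hidden-layer weights, W2: output-layer weights,
% X: training inputs (one row per sample), D: correct outputs.
  alpha = 0.9;                    % learning rate (assumed value)
  N = size(X, 1);                 % number of training samples

  for k = 1:N
    x = X(k, :)';                 % k-th input as a column vector
    d = D(k);                     % k-th correct output

    v1 = W1*x;                    % forward pass: hidden layer
    y1 = Sigmoid(v1);
    v  = W2*y1;                   % forward pass: output node
    y  = Sigmoid(v);

    e     = d - y;                % output error
    delta = e;                    % cross entropy + sigmoid: delta equals the error

    e1     = W2'*delta;           % error propagated back to the hidden nodes
    delta1 = y1.*(1-y1).*e1;      % hidden delta still uses the sigmoid derivative

    W1 = W1 + alpha*delta1*x';    % per-sample weight updates (assumed SGD)
    W2 = W2 + alpha*delta*y1';
  end
end

function y = Sigmoid(x)
% Local helper (assumed definition of the logistic sigmoid).
  y = 1 ./ (1 + exp(-x));
end
```

A script that calls Sigmoid directly, as the test program does during inference, would need the helper in its own Sigmoid.m file rather than as a local function.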
The following program listing shows the TestBackpropCE.m file, which tests the BackpropCE function.
This program calls the BackpropCE function and trains the neural network 10,000 times.
The trained neural network then computes the output for each training input, and the result is displayed on the screen.
We verify that the neural network has been trained properly by comparing this output to the correct output.
Further explanation is omitted, as the code is almost identical to the previous example.
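clear all

X = [ 0 0 1;
      0 1 1;
      1 0 1;
      1 1 1;
    ];

D = [ 0 1 1 0 ];

W1 = 2*rand(4, 3) - 1;   % hidden-layer weights, uniform in (-1, 1)
W2 = 2*rand(1, 4) - 1;   % output-layer weights, uniform in (-1, 1)

for epoch = 1:10000      % train
  [W1, W2] = BackpropCE(W1, W2, X, D);
end

N = 4;                   % inference
for k = 1:N
  x  = X(k, :)';
  v1 = W1*x;
  y1 = Sigmoid(v1);
  v  = W2*y1;
  y  = Sigmoid(v)        % no semicolon, so each output is displayed
end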
Executing this code displays the network output for each of the four training inputs.
The output is very close to the correct output, D.
This indicates that the neural network has been trained successfully.
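The comparison with D can also be done numerically instead of by eye. The snippet below is a sketch that assumes W1, W2, X, and D are still in the workspace after running the test script. The anonymous function sigmoid and the variable names Y, maxAbsErr, and allCorrect are introduced here for illustration only.

```matlab
% Assumes W1 (4x3), W2 (1x4), X (4x3), and D (1x4) are in the workspace
% after running the test script.
sigmoid = @(v) 1 ./ (1 + exp(-v));    % stand-in for the Sigmoid helper

Y = sigmoid(W2 * sigmoid(W1 * X'));   % 1x4 row: one output per training input
maxAbsErr  = max(abs(D - Y));         % largest deviation from the correct output
allCorrect = isequal(round(Y), D);    % true if every output rounds to its target

fprintf('max |d - y| = %.4f, all correct: %d\n', maxAbsErr, allCorrect);
```

Because the weights are initialized randomly, the exact numbers vary from run to run.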
Comparison of Cost Functions
The only difference between the BackpropCE function from the previous section and the BackpropXOR function from the "XOR Problem" section is the calculation of the output node delta.
We will examine how this seemingly minor difference affects the learning performance.
The following listing shows the CEvsSSE.m file, which compares the mean errors of the two functions.
This article is translated from "MATLAB Deep Learning" by Phil Kim.