Neural Networks and Deep Learning Notes (2): Implementing Stochastic Gradient Descent in Python

# SGD: the stochastic gradient descent training routine.
# training_data is a list of (x, y) tuples representing the training inputs
# and the corresponding desired outputs.
# epochs is the number of epochs to train for.
# mini_batch_size is the size of the mini-batches used when sampling.
# eta is the learning rate, η.
# If the optional argument test_data is supplied, the network is evaluated
# against the test data after each epoch and partial progress is printed out.
# (These methods rely on "import random" and "import numpy as np" at module level.)
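To make the expected format of training_data concrete, here is a small toy example (not part of the original notes). It only assumes that each training example is a pair of numpy column vectors; the dimensions 3 and 2 are arbitrary and chosen purely for illustration.

import numpy as np

# A toy training set: each element is a tuple (x, y), where x is the input
# as an (n_in, 1) numpy column vector and y is the desired output as an
# (n_out, 1) column vector.
training_data = [
    (np.random.randn(3, 1), np.array([[0.0], [1.0]])),
    (np.random.randn(3, 1), np.array([[1.0], [0.0]])),
    (np.random.randn(3, 1), np.array([[0.0], [1.0]])),
    (np.random.randn(3, 1), np.array([[1.0], [0.0]])),
]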

def SGD(self, training_data, epochs, mini_batch_size, eta,
        test_data=None):
    """Train the neural network using mini-batch stochastic
    gradient descent.  The "training_data" is a list of tuples
    "(x, y)" representing the training inputs and the desired
    outputs.  The other non-optional parameters are
    self-explanatory.  If "test_data" is provided then the
    network will be evaluated against the test data after each
    epoch, and partial progress printed out.  This is useful for
    tracking progress, but slows things down substantially."""
    # Size of the test set (only needed for progress reporting)
    if test_data:
        n_test = len(test_data)
    n = len(training_data)
    for j in xrange(epochs):
        # Shuffle the training data at the start of each epoch
        random.shuffle(training_data)
        # Partition the training data into mini-batches
        mini_batches = [
            training_data[k:k+mini_batch_size]
            for k in xrange(0, n, mini_batch_size)]
        # Apply one gradient descent step per mini-batch
        for mini_batch in mini_batches:
            self.update_mini_batch(mini_batch, eta)
        # Report progress after each epoch
        if test_data:
            print "Epoch {0}: {1} / {2}".format(
                j, self.evaluate(test_data), n_test)
        else:
            print "Epoch {0} complete".format(j)


def update_mini_batch(self, mini_batch, eta):
    """Update the network's weights and biases by applying
    gradient descent using backpropagation to a single mini
    batch.  The "mini_batch" is a list of tuples "(x, y)", and
    "eta" is the learning rate."""
    # Gradient accumulators: one zero array per layer, with the
    # same shapes as the biases and weights
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:
        # Backpropagation gives the gradient of the cost for one example
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # Gradient descent update, averaged over the mini-batch
    self.weights = [w-(eta/len(mini_batch))*nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b-(eta/len(mini_batch))*nb
                   for b, nb in zip(self.biases, nabla_b)]
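The last two list comprehensions are a direct transcription of the mini-batch gradient descent update rule, with m = len(mini_batch) and the per-example gradients coming from self.backprop(x, y):

$$ w \;\to\; w - \frac{\eta}{m}\sum_{x} \frac{\partial C_x}{\partial w}, \qquad b \;\to\; b - \frac{\eta}{m}\sum_{x} \frac{\partial C_x}{\partial b} $$

Each entry of nabla_w and nabla_b accumulates the sum of the per-example gradients for one layer, and dividing by len(mini_batch) averages them before taking a step of size η.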
