w_x_w1985

深度学习之卷积神经网络(CNN)详解与代码实现（一）

卷积神经网络(CNN)详解与代码实现

本文系作者原创，转载请注明出处:https://www.cnblogs.com/further-further-further/p/10430073.html

1.应用场景

2.卷积神经网络结构

2.1 卷积（convelution）

2.2 Relu激活函数

2.3 池化（pool）

2.4 全连接（full connection）

2.5 损失函数（softmax_loss）

2.6 前向传播（forward propagation）

2.7 反向传播（backford propagation）

2.8 随机梯度下降（sgd_momentum）

3.代码实现流程图以及介绍

4.代码实现（python3.6）

5.运行结果以及分析

6.参考文献

1.应用场景

卷积神经网络的应用不可谓不广泛，主要有两大类，数据预测和图片处理。数据预测自然不需要多说，图片处理主要包含有图像分类，检测，识别，以及分割方面的应用。

图像分类：场景分类，目标分类

图像检测：显著性检测，物体检测，语义检测等等

图像识别：人脸识别，字符识别，车牌识别，行为识别，步态识别等等

图像分割：前景分割，语义分割

2.卷积神经网络结构

卷积神经网络主要是由输入层、卷积层、激活函数、池化层、全连接层、损失函数组成，表面看比较复杂，其实质就是特征提取以及决策推断。

要使特征提取尽量准确，就需要将这些网络层结构进行组合，比如经典的卷积神经网络模型AlexNet:5个卷积层+3个池化层+3个连接层结构。

2.1 卷积（convolution）

卷积的作用就是提取特征，因为一次卷积可能提取的特征比较粗糙，所以多次卷积，以及层层纵深卷积，层层提取特征（千万要区别于多次卷积，因为每一层里含有多次卷积）。

这里可能就有小伙伴问：为什么要进行层层纵深卷积，而且还要每层多次？

你可以理解为物质A有自己的多个特征（高、矮、胖、瘦、、、），所以在物质A上需要多次提取，得到不同的特征，然后这些特征组合后发生化学反应生成物质B，

而物质B又有一些新的专属于自己的特征，所以需要进一步卷积。这是我个人的理解，不对的话或者有更形象的比喻还请不吝赐教啊。

在卷积层中，每一层的卷积核是不一样的。比如AlexNet

第一层：96*11*11（96表示卷积核个数，11表示卷积核矩阵宽*高） stride（步长） = 4 pad（边界补零） = 0

第二层：256*5*5 stride（步长） = 1 pad（边界补零） = 2

第三，四层：384*3*3 stride（步长） = 1 pad（边界补零） = 1

第五层：256*3*3 stride（步长） = 1 pad（边界补零） = 2

卷积的篇幅说了这么多，那么到底是如何进行运算的呢，虽说网络上关于卷积运算原理铺天盖地，但是个人总感觉讲得不够透彻，或者说本人智商有待提高，

希望通过如下这幅图（某位大神的杰作）来使各位看官们能够真正理解。

这里举的例子是一个输入图片（5*5*3），卷积核（3*3*3），有两个（Filter W0，W1），偏置b也有两个（Bios b0，b1），卷积结果Output Volumn（3*3*2），步长stride = 2。

输入：7*7*3 是因为 pad = 1 （在图片边界行和列都补零，补零的行和的数目是1），

（对于彩色图片，一般都是RGB3种颜色，号称3通道，7*7指图片高h * 宽w）

，补零的作用是能够提取图片边界的特征。

卷积核深度为什么要设置成3呢？这是因为输入是3通道，所以卷积核深度必须与输入的深度相同。至于卷积核宽w，高h则是可以变化的，但是宽高必须相等。

卷积核输出o[0,0,0] = 3 （Output Volumn下浅绿色框结果），这个结果是如何得到的呢？其实关键就是矩阵对应位置相乘再相加（千万不要跟矩阵乘法搞混淆啦）

=> w0[:,:,0] * x[:,:,0]蓝色区域矩阵(R通道) + w0[:,:,1] * x[:,:,1]蓝色区域矩阵（G通道）+ w0[:,:,2] * x[:,:,2]蓝色区域矩阵（B通道） + b0（千万不能丢，因为 y = w * x + b）

第一项 => 0 * 1 + 0 * 1 + 0 * 1 + 0 * (-1) + 1 * (-1) + 1 * 0 + 0 * (-1) + 1 * 1 + 1 * 0 = 0

第二项 => 0 * (-1) + 0 * (-1) + 0 * 1 + 0 * (-1) + 0 * 1 + 1 * 0 + 0 * (-1) + 2 * 1 + 2 * 0 = 2

第三项 => 0 * 1 + 0 * 0 + 0 * (-1) + 0 * 0 + 2 * 0 + 2 * 0 + 0 * 1 + 0 * (-1) + 0 * (-1) = 0

卷积核输出o[0,0,0] = > 第一项 + 第二项 + 第三项 + b0 = 0 + 2 + 0 + 1 = 3

o[0,0,1] = -5 又是如何得到的呢？

因为这里的stride = 2 ，所以输入的窗口就要滑动两个步长，也就是红色框的区域，而运算跟之前是一样的

第一项 => 0 * 1 + 0 * 1 + 0 * 1 + 1 * (-1) + 2 * (-1) + 2 * 0 + 1 * (-1) + 1 * 1 + 2 * 0 = -3

第二项 => 0 * (-1) + 0 * (-1) + 0 * 1 + 1 * (-1) + 2 * 1 + 0 * 0 + 2 * (-1) + 1 * 1 + 1 * 0 = 0

第三项 => 0 * 1 + 0 * 0 + 0 * (-1) + 2 * 0 + 0 * 0 + 1 * 0 + 0 * 1 + 2 * (-1) + 1 * (-1) = - 3

卷积核输出o[0,0,1] = > 第一项 + 第二项 + 第三项 + b0 = (-3) + 0 + (-3) + 1 = -5

之后以此卷积核窗口大小在输入图片上滑动，卷积求出结果，因为有两个卷积核，所有就有两个输出结果。

这里小伙伴可能有个疑问，输出窗口是如何得到的呢？

这里有一个公式：输出窗口宽 w = (输入窗口宽 w - 卷积核宽 w + 2 * pad)/stride + 1 ，输出高 h = 输出窗口宽 w

以上面例子，输出窗口宽 w = ( 5 - 3 + 2 * 1)/2 + 1 = 3 ，则输出窗口大小为 3 * 3，因为有2个输出，所以是 3*3*2。

2.2 Relu激活函数

相信看过卷积神经网络结构（CNN）的伙伴们都知道，激活函数无处不在，特别是CNN中，在卷积层后，全连接（FC）后都有激活函数Relu的身影，

那么这就自然不得不让我们产生疑问：

问题1、为什么要用激活函数？它的作用是什么？

问题2、在CNN中为什么要用Relu，相比于sigmoid，tanh，它的优势在什么地方？

对于第1个问题：由 y = w * x + b 可知，如果不用激活函数，每个网络层的输出都是一种线性输出，而我们所处的现实场景，其实更多的是各种非线性的分布。

这也说明了激活函数的作用是将线性分布转化为非线性分布，能更逼近我们的真实场景。

对于第2个问题：先看sigmoid，tanh分布

他们在 x -> 时，输出就变成了恒定值，因为求梯度时需要对函数求一阶偏导数，而不论是sigmoid，还是tanhx，他们的偏导都为0，

也就是存在所谓的梯度消失问题，最终也就会导致权重参数w ， b 无法更新。相比之下，Relu就不存在这样的问题，另外在 x > 0 时，

Relu求导 = 1，这对于反向传播计算dw，db，是能够大大的简化运算的。

使用sigmoid还会存在梯度爆炸的问题，比如在进行前向传播和反向传播迭代次数非常多的情况下，sigmoid因为是指数函数，其结果中

某些值会在迭代中累积，并成指数级增长，最终会出现NaN而导致溢出。

2.3 池化

池化层一般在卷积层+ Relu之后，它的作用是：

1、减小输入矩阵的大小（只是宽和高，而不是深度），提取主要特征。（不可否认的是，在池化后，特征会有一定的损失，所以，有些经典模型就去掉了池化这一层）。

它的目的是显而易见的，就是在后续操作时能降低运算。

2、一般采用mean_pooling（均值池化）和max_pooling（最大值池化），对于输入矩阵有translation（平移），rotation（旋转），能够保证特征的不变性。

mean_pooling 就是输入矩阵池化区域求均值，这里要注意的是池化窗口在输入矩阵滑动的步长跟stride有关，一般stride = 2.（图片是直接盗过来，这里感谢原创）

最右边7/4 => (1 + 1 + 2 + 3)/4

max_pooling 最大值池化，就是每个池化区域的最大值放在输出对应位置上。

2.4 全连接（full connection）

作用：分类器角色，将特征映射到样本标记空间，本质是矩阵变换（affine）。

至于变换的实现见后面的代码流程图，或者最好是跟一下代码，这样理解更透彻。

2.5 损失函数（softmax_loss）

作用：计算损失loss，从而求出梯度grad。

常用损失函数有:MSE均方误差，SVM（支持向量机）合页损失函数，Cross Entropy交叉熵损失函数。

这几种损失函数目前还看不出谁优谁劣，估计只有在具体的应用场景中去验证了。至于这几种损失函数的介绍，

大家可以去参考《常用损失函数小结》https://blog.csdn.net/zhangjunp3/article/details/80467350，这个哥们写得比较详细。

在后面的代码实例中，用到的是softmax_loss，它属于Cross Entropy交叉熵损失函数。

softmax计算公式：

其中，是要计算的类别的网络输出，分母是网络输出所有类别之和（共有个类别），表示第类的概率。

交叉熵损失：

其中，是类别的真实标签，表示第类的概率，是样本总数，是类别数。

梯度:

= 当 !=

= - 1 当 =

其中表示真实标签对应索引下预测的目标值，类别索引。

这个有点折磨人，原理讲解以及推导请大家可以参考这位大神的博客：http://www.cnblogs.com/zongfa/p/8971213.html。

2.6 前向传播（forward propagation）

前向传播包含之前的卷积，Relu激活函数，池化（pool），全连接(fc)，可以说，在损失函数之前操作都属于前向传播。

主要是权重参数w , b 初始化，迭代，以及更新w, b,生成分类器模型。

2.7 反向传播（back propagation）

反向传播包含损失函数，通过梯度计算dw，db，Relu激活函数逆变换，反池化，反全连接。

2.8 随机梯度下降（sgd_momentum）

作用：由梯度grad计算新的权重矩阵w

sgd公式：

其中，

缺点：其更新方向完全依赖于当前的batch，因而其更新十分不稳定。

为了解决这个问题，momentum就横空出世了，具体原理详解见下路派出所（这名字霸气）的博客http://www.cnblogs.com/callyblog/p/8299074.html。

momentum即动量，它模拟的是物体运动时的惯性，即更新的时候在一定程度上保留之前更新的方向，同时利用当前batch的梯度微调最终的更新方向。

这样一来，可以在一定程度上增加稳定性，从而学习地更快，并且还有一定摆脱局部最优的能力：

其中，

start.py

 1 # -*- coding: utf-8 -*-
 2 import matplotlib.pyplot as plt
 3 '''同路径下py模块引用'''
 4 
 5 try:
 6     from . import data_utils
 7     from . import solver
 8     from . import cnn
 9 except Exception:
10     import data_utils
11     import solver
12     import cnn
13 
14 import numpy as np
15 # 获取样本数据
16 data = data_utils.get_CIFAR10_data()
17 # model初始化（权重因子以及对应偏置 w1,b1 ,w2,b2 ,w3,b3，数量取决于网络层数）
18 model = cnn.ThreeLayerConvNet(reg=0.9)
19 solver = solver.Solver(model, data,
20                 lr_decay=0.95,                
21                 print_every=10, num_epochs=5, batch_size=2, 
22                 update_rule='sgd_momentum',                
23                 optim_config={'learning_rate': 5e-4, 'momentum': 0.9})
24 # 训练，获取最佳model
25 solver.train()                 
26 
27 plt.subplot(2, 1, 1) 
28 plt.title('Training loss')
29 plt.plot(solver.loss_history, 'o')
30 plt.xlabel('Iteration')
31 
32 plt.subplot(2, 1, 2)
33 plt.title('Accuracy')
34 plt.plot(solver.train_acc_history, '-o', label='train')
35 plt.plot(solver.val_acc_history, '-o', label='val')
36 plt.plot([0.5] * len(solver.val_acc_history), 'k--')
37 plt.xlabel('Epoch')
38 plt.legend(loc='lower right')
39 plt.gcf().set_size_inches(15, 12)
40 plt.show()
41 
42 
43 best_model = model
44 y_test_pred = np.argmax(best_model.loss(data['X_test']), axis=1)
45 y_val_pred = np.argmax(best_model.loss(data['X_val']), axis=1)
46 print ('Validation set accuracy: ',(y_val_pred == data['y_val']).mean())
47 print ('Test set accuracy: ', (y_test_pred == data['y_test']).mean())
48 # Validation set accuracy:  about 52.9%
49 # Test set accuracy:  about 54.7%
50 
51 
52 # Visualize the weights of the best network
53 """
54 from vis_utils import visualize_grid
55 
56 def show_net_weights(net):    
57     W1 = net.params['W1']    
58     W1 = W1.reshape(3, 32, 32, -1).transpose(3, 1, 2, 0)    
59     plt.imshow(visualize_grid(W1, padding=3).astype('uint8'))   
60     plt.gca().axis('off')    
61 show_net_weights(best_model)
62 plt.show()
63 """

View Code

cnn.py

 1 # -*- coding: utf-8 -*-
 2 try:
 3     from . import layer_utils
 4     from . import layers
 5 except Exception:
 6     import layer_utils
 7     import layers
 8 import numpy as np
 9 
10 class  ThreeLayerConvNet(object):
11     """    
12     A three-layer convolutional network with the following architecture:       
13        conv - relu - 2x2 max pool - affine - relu - affine - softmax
14     """
15 
16     def __init__(self, input_dim=(3, 32, 32), num_filters=32, filter_size=7,             
17                  hidden_dim=100, num_classes=10, weight_scale=1e-3, reg=0.0,
18                  dtype=np.float32):
19         self.params = {}
20         self.reg = reg
21         self.dtype = dtype
22 
23         # Initialize weights and biases
24         C, H, W = input_dim
25         self.params['W1'] = weight_scale * np.random.randn(num_filters, C, filter_size, filter_size)
26         self.params['b1'] = np.zeros(num_filters)
27         self.params['W2'] = weight_scale * np.random.randn(num_filters*H*W//4, hidden_dim)
28         self.params['b2'] = np.zeros(hidden_dim)
29         self.params['W3'] = weight_scale * np.random.randn(hidden_dim, num_classes)
30         self.params['b3'] = np.zeros(num_classes)
31 
32         for k, v in self.params.items():    
33             self.params[k] = v.astype(dtype)
34 
35 
36     def loss(self, X, y=None):
37         W1, b1 = self.params['W1'], self.params['b1']
38         W2, b2 = self.params['W2'], self.params['b2']
39         W3, b3 = self.params['W3'], self.params['b3']
40 
41         # pass conv_param to the forward pass for the convolutional layer
42         filter_size = W1.shape[2]
43         conv_param = {'stride': 1, 'pad': (filter_size - 1) // 2}
44 
45         # pass pool_param to the forward pass for the max-pooling layer
46         pool_param = {'pool_height': 2, 'pool_width': 2, 'stride': 2}
47 
48         # compute the forward pass
49         a1, cache1 = layer_utils.conv_relu_pool_forward(X, W1, b1, conv_param, pool_param)
50         a2, cache2 = layer_utils.affine_relu_forward(a1, W2, b2)
51         scores, cache3 = layers.affine_forward(a2, W3, b3)
52 
53         if y is None:    
54             return scores
55 
56         # compute the backward pass
57         data_loss, dscores = layers.softmax_loss(scores, y)
58         da2, dW3, db3 = layers.affine_backward(dscores, cache3)
59         da1, dW2, db2 = layer_utils.affine_relu_backward(da2, cache2)
60         dX, dW1, db1 = layer_utils.conv_relu_pool_backward(da1, cache1)
61 
62         # Add regularization 引入修正因子，重新计算损失，梯度
63         dW1 += self.reg * W1
64         dW2 += self.reg * W2
65         dW3 += self.reg * W3
66         reg_loss = 0.5 * self.reg * sum(np.sum(W * W) for W in [W1, W2, W3])
67 
68         loss = data_loss + reg_loss
69         grads = {'W1': dW1, 'b1': db1, 'W2': dW2, 'b2': db2, 'W3': dW3, 'b3': db3}
70 
71         return loss, grads

View Code

data.utils.py

  1 # -*- coding: utf-8 -*-
  2 import pickle 
  3 import numpy as np
  4 import os
  5 
  6 #from scipy.misc import imread
  7 
  8 def load_CIFAR_batch(filename):
  9   """ load single batch of cifar """
 10   with open(filename, 'rb') as f:
 11     datadict = pickle.load(f, encoding='bytes')
 12     X = datadict[b'data']
 13     Y = datadict[b'labels']
 14     X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("float")
 15     Y = np.array(Y)
 16     return X, Y
 17 
 18 def load_CIFAR10(ROOT):
 19   """ load all of cifar """
 20   xs = []
 21   ys = []
 22   for b in range(1,2):
 23     f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
 24     X, Y = load_CIFAR_batch(f)
 25     xs.append(X)
 26     ys.append(Y)    
 27   Xtr = np.concatenate(xs)
 28   Ytr = np.concatenate(ys)
 29   del X, Y
 30   Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
 31   return Xtr, Ytr, Xte, Yte
 32 
 33 
 34 def get_CIFAR10_data(num_training=500, num_validation=50, num_test=50):
 35 
 36     """
 37     Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
 38     it for classifiers. These are the same steps as we used for the SVM, but
 39     condensed to a single function.
 40     """
 41     # Load the raw CIFAR-10 data
 42 
 43     #cifar10_dir = 'C://download//cifar-10-python//cifar-10-batches-py//'
 44     cifar10_dir = '.\\cifar-10-batches-py\\'
 45     X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
 46     print (X_train.shape)
 47     # Subsample the data
 48     mask = range(num_training, num_training + num_validation)
 49     X_val = X_train[mask]
 50     y_val = y_train[mask]
 51     mask = range(num_training)
 52     X_train = X_train[mask]
 53     y_train = y_train[mask]
 54     mask = range(num_test)
 55     X_test = X_test[mask]
 56     y_test = y_test[mask]
 57 
 58     # 标准化数据，求样本均值，然后 样本 - 样本均值，作用：使样本数据更收敛一些，便于后续处理
 59     # Normalize the data: subtract the mean image
 60     # 如果2维空间 m*n np.mean()后 => 1*n
 61     # 对于4维空间 m*n*k*j np.mean()后 => 1*n*k*j
 62     mean_image = np.mean(X_train, axis=0)
 63     X_train -= mean_image
 64     X_val -= mean_image
 65     X_test -= mean_image
 66 
 67     # 把通道channel 提前
 68     # Transpose so that channels come first
 69     X_train = X_train.transpose(0, 3, 1, 2).copy()
 70     X_val = X_val.transpose(0, 3, 1, 2).copy()
 71     X_test = X_test.transpose(0, 3, 1, 2).copy()
 72 
 73     # Package data into a dictionary
 74     return {
 75       'X_train': X_train, 'y_train': y_train,
 76       'X_val': X_val, 'y_val': y_val,
 77       'X_test': X_test, 'y_test': y_test,
 78     }
 79     
 80 """
 81 def load_tiny_imagenet(path, dtype=np.float32):
 82   
 83   Load TinyImageNet. Each of TinyImageNet-100-A, TinyImageNet-100-B, and
 84   TinyImageNet-200 have the same directory structure, so this can be used
 85   to load any of them.
 86 
 87   Inputs:
 88   - path: String giving path to the directory to load.
 89   - dtype: numpy datatype used to load the data.
 90 
 91   Returns: A tuple of
 92   - class_names: A list where class_names[i] is a list of strings giving the
 93     WordNet names for class i in the loaded dataset.
 94   - X_train: (N_tr, 3, 64, 64) array of training images
 95   - y_train: (N_tr,) array of training labels
 96   - X_val: (N_val, 3, 64, 64) array of validation images
 97   - y_val: (N_val,) array of validation labels
 98   - X_test: (N_test, 3, 64, 64) array of testing images.
 99   - y_test: (N_test,) array of test labels; if test labels are not available
100     (such as in student code) then y_test will be None.
101   
102   # First load wnids
103   with open(os.path.join(path, 'wnids.txt'), 'r') as f:
104     wnids = [x.strip() for x in f]
105 
106   # Map wnids to integer labels
107   wnid_to_label = {wnid: i for i, wnid in enumerate(wnids)}
108 
109   # Use words.txt to get names for each class
110   with open(os.path.join(path, 'words.txt'), 'r') as f:
111     wnid_to_words = dict(line.split('\t') for line in f)
112     for wnid, words in wnid_to_words.iteritems():
113       wnid_to_words[wnid] = [w.strip() for w in words.split(',')]
114   class_names = [wnid_to_words[wnid] for wnid in wnids]
115 
116   # Next load training data.
117   X_train = []
118   y_train = []
119   for i, wnid in enumerate(wnids):
120     if (i + 1) % 20 == 0:
121       print 'loading training data for synset %d / %d' % (i + 1, len(wnids))
122     # To figure out the filenames we need to open the boxes file
123     boxes_file = os.path.join(path, 'train', wnid, '%s_boxes.txt' % wnid)
124     with open(boxes_file, 'r') as f:
125       filenames = [x.split('\t')[0] for x in f]
126     num_images = len(filenames)
127     
128     X_train_block = np.zeros((num_images, 3, 64, 64), dtype=dtype)
129     y_train_block = wnid_to_label[wnid] * np.ones(num_images, dtype=np.int64)
130     for j, img_file in enumerate(filenames):
131       img_file = os.path.join(path, 'train', wnid, 'images', img_file)
132       img = imread(img_file)
133       if img.ndim == 2:
134         ## grayscale file
135         img.shape = (64, 64, 1)
136       X_train_block[j] = img.transpose(2, 0, 1)
137     X_train.append(X_train_block)
138     y_train.append(y_train_block)
139       
140   # We need to concatenate all training data
141   X_train = np.concatenate(X_train, axis=0)
142   y_train = np.concatenate(y_train, axis=0)
143   
144   # Next load validation data
145   with open(os.path.join(path, 'val', 'val_annotations.txt'), 'r') as f:
146     img_files = []
147     val_wnids = []
148     for line in f:
149       img_file, wnid = line.split('\t')[:2]
150       img_files.append(img_file)
151       val_wnids.append(wnid)
152     num_val = len(img_files)
153     y_val = np.array([wnid_to_label[wnid] for wnid in val_wnids])
154     X_val = np.zeros((num_val, 3, 64, 64), dtype=dtype)
155     for i, img_file in enumerate(img_files):
156       img_file = os.path.join(path, 'val', 'images', img_file)
157       img = imread(img_file)
158       if img.ndim == 2:
159         img.shape = (64, 64, 1)
160       X_val[i] = img.transpose(2, 0, 1)
161 
162   # Next load test images
163   # Students won't have test labels, so we need to iterate over files in the
164   # images directory.
165   img_files = os.listdir(os.path.join(path, 'test', 'images'))
166   X_test = np.zeros((len(img_files), 3, 64, 64), dtype=dtype)
167   for i, img_file in enumerate(img_files):
168     img_file = os.path.join(path, 'test', 'images', img_file)
169     img = imread(img_file)
170     if img.ndim == 2:
171       img.shape = (64, 64, 1)
172     X_test[i] = img.transpose(2, 0, 1)
173 
174   y_test = None
175   y_test_file = os.path.join(path, 'test', 'test_annotations.txt')
176   if os.path.isfile(y_test_file):
177     with open(y_test_file, 'r') as f:
178       img_file_to_wnid = {}
179       for line in f:
180         line = line.split('\t')
181         img_file_to_wnid[line[0]] = line[1]
182     y_test = [wnid_to_label[img_file_to_wnid[img_file]] for img_file in img_files]
183     y_test = np.array(y_test)
184   
185   return class_names, X_train, y_train, X_val, y_val, X_test, y_test
186 
187 """
188 def load_models(models_dir):
189   """
190   Load saved models from disk. This will attempt to unpickle all files in a
191   directory; any files that give errors on unpickling (such as README.txt) will
192   be skipped.
193 
194   Inputs:
195   - models_dir: String giving the path to a directory containing model files.
196     Each model file is a pickled dictionary with a 'model' field.
197 
198   Returns:
199   A dictionary mapping model file names to models.
200   """
201   models = {}
202   for model_file in os.listdir(models_dir):
203     with open(os.path.join(models_dir, model_file), 'rb') as f:
204       try:
205         models[model_file] = pickle.load(f)['model']
206       except pickle.UnpicklingError:
207         continue
208   return models

View Code

layer.utils.py

 1 # -*- coding: utf-8 -*-
 2 try:
 3   from . import layers
 4 except Exception:
 5   import layers
 6 
 7 
 8 
 9 
10 def affine_relu_forward(x, w, b):
11   """
12   Convenience layer that perorms an affine transform followed by a ReLU
13 
14   Inputs:
15   - x: Input to the affine layer
16   - w, b: Weights for the affine layer
17 
18   Returns a tuple of:
19   - out: Output from the ReLU
20   - cache: Object to give to the backward pass
21   """
22   a, fc_cache = layers.affine_forward(x, w, b)
23   out, relu_cache = layers.relu_forward(a)
24   cache = (fc_cache, relu_cache)
25   return out, cache
26 
27 
28 def affine_relu_backward(dout, cache):
29   """
30   Backward pass for the affine-relu convenience layer
31   """
32   fc_cache, relu_cache = cache
33   da = layers.relu_backward(dout, relu_cache)
34   dx, dw, db = layers.affine_backward(da, fc_cache)
35   return dx, dw, db
36 
37 
38 pass
39 
40 
41 def conv_relu_forward(x, w, b, conv_param):
42   """
43   A convenience layer that performs a convolution followed by a ReLU.
44 
45   Inputs:
46   - x: Input to the convolutional layer
47   - w, b, conv_param: Weights and parameters for the convolutional layer
48   
49   Returns a tuple of:
50   - out: Output from the ReLU
51   - cache: Object to give to the backward pass
52   """
53   a, conv_cache = layers.conv_forward_fast(x, w, b, conv_param)
54   out, relu_cache = layers.relu_forward(a)
55   cache = (conv_cache, relu_cache)
56   return out, cache
57 
58 
59 def conv_relu_backward(dout, cache):
60   """
61   Backward pass for the conv-relu convenience layer.
62   """
63   conv_cache, relu_cache = cache
64   da = layers.relu_backward(dout, relu_cache)
65   dx, dw, db = layers.conv_backward_fast(da, conv_cache)
66   return dx, dw, db
67 
68 
69 def conv_relu_pool_forward(x, w, b, conv_param, pool_param):
70   """
71   Convenience layer that performs a convolution, a ReLU, and a pool.
72 
73   Inputs:
74   - x: Input to the convolutional layer
75   - w, b, conv_param: Weights and parameters for the convolutional layer
76   - pool_param: Parameters for the pooling layer
77 
78   Returns a tuple of:
79   - out: Output from the pooling layer
80   - cache: Object to give to the backward pass
81   """
82   a, conv_cache = layers.conv_forward_naive(x, w, b, conv_param)
83   s, relu_cache = layers.relu_forward(a)
84   out, pool_cache = layers.max_pool_forward_naive(s, pool_param)
85   cache = (conv_cache, relu_cache, pool_cache)
86   return out, cache
87 
88 
89 def conv_relu_pool_backward(dout, cache):
90   """
91   Backward pass for the conv-relu-pool convenience layer
92   """
93   conv_cache, relu_cache, pool_cache = cache
94   ds = layers.max_pool_backward_naive(dout, pool_cache)
95   da = layers.relu_backward(ds, relu_cache)
96   dx, dw, db = layers.conv_backward_naive(da, conv_cache)
97   return dx, dw, db

View Code

layers.py

  1 import numpy as np
  2 
  3 '''
  4 全连接层：矩阵变换，获取对应目标相同的行与列
  5 输入x: 2*32*16*16 
  6 输入x_row: 2*8192
  7 超参w：8192*100
  8 输出：矩阵乘法 2*8192 ->8192*100 =>2*100
  9 '''
 10 def affine_forward(x, w, b):   
 11     """    
 12     Computes the forward pass for an affine (fully-connected) layer. 
 13     The input x has shape (N, d_1, ..., d_k) and contains a minibatch of N   
 14     examples, where each example x[i] has shape (d_1, ..., d_k). We will    
 15     reshape each input into a vector of dimension D = d_1 * ... * d_k, and    
 16     then transform it to an output vector of dimension M.    
 17     Inputs:    
 18     - x: A numpy array containing input data, of shape (N, d_1, ..., d_k)    
 19     - w: A numpy array of weights, of shape (D, M)    
 20     - b: A numpy array of biases, of shape (M,)   
 21     Returns a tuple of:    
 22     - out: output, of shape (N, M)    
 23     - cache: (x, w, b)   
 24     """
 25     out = None
 26     # Reshape x into rows
 27     N = x.shape[0]
 28     x_row = x.reshape(N, -1)         # (N,D) -1表示不知道多少列，指定行，就能算出列 = 2 * 32 * 16 * 16/2 = 8192
 29     out = np.dot(x_row, w) + b       # (N,M) 2*8192 8192*100 =>2 * 100
 30     cache = (x, w, b)
 31 
 32     return out, cache
 33 '''
 34 反向传播之affine矩阵变换
 35 根据dout求出dx,dw,db
 36 由 out = w * x =>
 37 dx = dout * w
 38 dw = dout * x
 39 db = dout * 1
 40 因为dx 与 x，dw 与 w，db 与 b 大小（维度）必须相同
 41 dx = dout * wT  矩阵乘法
 42 dw = dxT * dout 矩阵乘法
 43 db = dout 按列求和
 44 '''
 45 def affine_backward(dout, cache):   
 46     """    
 47     Computes the backward pass for an affine layer.    
 48     Inputs:    
 49     - dout: Upstream derivative, of shape (N, M)    
 50     - cache: Tuple of: 
 51     - x: Input data, of shape (N, d_1, ... d_k)    
 52     - w: Weights, of shape (D, M)    
 53     Returns a tuple of:   
 54     - dx: Gradient with respect to x, of shape (N, d1, ..., d_k)
 55       dx = dout * w
 56     - dw: Gradient with respect to w, of shape (D, M)
 57       dw = dout * x
 58     - db: Gradient with respect to b, of shape (M,)
 59       db = dout * 1
 60     """
 61 
 62     x, w, b = cache    
 63     dx, dw, db = None, None, None
 64     dx = np.dot(dout, w.T)                       # (N,D)
 65     # dx维度必须跟x维度相同
 66     dx = np.reshape(dx, x.shape)                 # (N,d1,...,d_k)
 67     # 转换成二维矩阵
 68     x_row = x.reshape(x.shape[0], -1)            # (N,D)
 69     dw = np.dot(x_row.T, dout)                   # (D,M)
 70 
 71     db = np.sum(dout, axis=0, keepdims=True)     # (1,M)    
 72 
 73     return dx, dw, db
 74 
 75 def relu_forward(x):   
 76     """ 激活函数，解决sigmoid梯度消失问题，网络性能比sigmoid更好
 77     Computes the forward pass for a layer of rectified linear units (ReLUs).    
 78     Input:    
 79     - x: Inputs, of any shape    
 80     Returns a tuple of:    
 81     - out: Output, of the same shape as x    
 82     - cache: x    
 83     """   
 84     out = None    
 85     out = ReLU(x)    
 86     cache = x    
 87 
 88     return out, cache
 89 
 90 def relu_backward(dout, cache):   
 91     """  
 92     Computes the backward pass for a layer of rectified linear units (ReLUs).   
 93     Input:    
 94     - dout: Upstream derivatives, of any shape    
 95     - cache: Input x, of same shape as dout    
 96     Returns:    
 97     - dx: Gradient with respect to x    
 98     """    
 99     dx, x = None, cache    
100     dx = dout    
101     dx[x <= 0] = 0    
102 
103     return dx
104 
105 def svm_loss(x, y):   
106     """    
107     Computes the loss and gradient using for multiclass SVM classification.    
108     Inputs:    
109     - x: Input data, of shape (N, C) where x[i, j] is the score for the jth class         
110          for the ith input.    
111     - y: Vector of labels, of shape (N,) where y[i] is the label for x[i] and         
112          0 <= y[i] < C   
113     Returns a tuple of:    
114     - loss: Scalar giving the loss   
115     - dx: Gradient of the loss with respect to x    
116     """    
117     N = x.shape[0]   
118     correct_class_scores = x[np.arange(N), y]    
119     margins = np.maximum(0, x - correct_class_scores[:, np.newaxis] + 1.0)    
120     margins[np.arange(N), y] = 0   
121     loss = np.sum(margins) / N   
122     num_pos = np.sum(margins > 0, axis=1)    
123     dx = np.zeros_like(x)   
124     dx[margins > 0] = 1    
125     dx[np.arange(N), y] -= num_pos    
126     dx /= N    
127 
128     return loss, dx
129 '''
130 softmax_loss 求梯度优点: 求梯度运算简单，方便 
131 softmax: softmax用于多分类过程中，它将多个神经元的输出，映射到（0,1）区间内，
132 可以看成概率来理解，从而来进行多分类。
133 Si = exp(i)/[exp(j)求和]
134 softmax_loss：损失函数，求梯度dx必须用到损失函数，通过梯度下降更新超参
135 Loss = -[Ypred*ln(Sj真实类别位置的概率值)]求和
136 梯度dx : 对损失函数求一阶偏导
137 如果 j = i =>dx = Sj - 1
138 如果 j != i => dx = Sj
139 '''
140 def softmax_loss(x, y):    
141     """    
142     Computes the loss and gradient for softmax classification.    Inputs:    
143     - x: Input data, of shape (N, C) where x[i, j] is the score for the jth class         
144     for the ith input.    
145     - y: Vector of labels, of shape (N,) where y[i] is the label for x[i] and         
146          0 <= y[i] < C   
147     Returns a tuple of:    
148     - loss: Scalar giving the loss    
149     - dx: Gradient of the loss with respect to x   
150     """
151     '''
152      x - np.max(x, axis=1, keepdims=True) 对数据进行预处理，
153      防止np.exp(x - np.max(x, axis=1, keepdims=True))得到结果太分散；
154      np.max(x, axis=1, keepdims=True)保证所得结果维度不变；
155     '''
156     probs = np.exp(x - np.max(x, axis=1, keepdims=True))
157     # 计算softmax，准确的说应该是soft，因为还没有选取概率最大值的操作
158     probs /= np.sum(probs, axis=1, keepdims=True)
159     # 样本图片个数
160     N = x.shape[0]
161     # 计算图片损失
162     loss = -np.sum(np.log(probs[np.arange(N), y])) / N
163     # 复制概率
164     dx = probs.copy()
165     # 针对 i = j 求梯度
166     dx[np.arange(N), y] -= 1
167     # 计算每张样本图片梯度
168     dx /= N    
169 
170     return loss, dx
171 
172 def ReLU(x):    
173     """ReLU non-linearity."""    
174     return np.maximum(0, x)
175 '''
176 功能：获取图片特征
177 前向卷积：每次用一个3维的卷积核与图片RGB各个通道分别卷积（卷积核1与R进行点积，卷积核2与G点积，卷积核3与B点积）,
178 然后将3个结果求和（也就是 w*x ）,再加上 b，就是新结果某一位置输出，这是卷积核在图片某一固定小范围内（卷积核大小）的卷积，
179 要想获得整个图片的卷积结果，需要在图片上滑动卷积核（先右后下），直至遍历整个图片。
180 x: 2*3*32*32  每次选取2张图片，图片大小32*32，彩色(3通道)
181 w: 32*3*7*7   卷积核每个大小是7*7；对应输入x的3通道，所以是3维，有32个卷积核
182 pad = 3(图片边缘行列补0)，stride = 1(卷积核移动步长)
183 输出宽*高结果：(32-7+2*3)/1 + 1 = 32
184 输出大小：2*32*32*32
185 '''
186 def conv_forward_naive(x, w, b, conv_param):
187     stride, pad = conv_param['stride'], conv_param['pad']
188     N, C, H, W = x.shape
189     F, C, HH, WW = w.shape
190     x_padded = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode='constant')
191     '''// : 求整型'''
192     H_new = 1 + (H + 2 * pad - HH) // stride
193     W_new = 1 + (W + 2 * pad - WW) // stride
194     s = stride
195     out = np.zeros((N, F, H_new, W_new))
196 
197     for i in range(N):       # ith image    
198         for f in range(F):   # fth filter        
199             for j in range(H_new):            
200                 for k in range(W_new):   
201                     #print x_padded[i, :, j*s:HH+j*s, k*s:WW+k*s].shape
202                     #print w[f].shape  
203                     #print b.shape  
204                     #print np.sum((x_padded[i, :, j*s:HH+j*s, k*s:WW+k*s] * w[f]))         
205                     out[i, f, j, k] = np.sum(x_padded[i, :, j*s:HH+j*s, k*s:WW+k*s] * w[f]) + b[f]
206 
207     cache = (x, w, b, conv_param)
208 
209     return out, cache
210 
211 '''
212 反向传播之卷积：卷积核3*7*7
213 输入dout:2*32*32*32
214 输出dx:2*3*32*32
215 '''
216 def conv_backward_naive(dout, cache):
217 
218     x, w, b, conv_param = cache
219     # 边界补0
220     pad = conv_param['pad']
221     # 步长
222     stride = conv_param['stride']
223     F, C, HH, WW = w.shape
224     N, C, H, W = x.shape
225     H_new = 1 + (H + 2 * pad - HH) // stride
226     W_new = 1 + (W + 2 * pad - WW) // stride
227 
228     dx = np.zeros_like(x)
229     dw = np.zeros_like(w)
230     db = np.zeros_like(b)
231 
232     s = stride
233     x_padded = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), 'constant')
234     dx_padded = np.pad(dx, ((0, 0), (0, 0), (pad, pad), (pad, pad)), 'constant')
235     # 图片个数
236     for i in range(N):       # ith image
237         # 卷积核滤波个数
238         for f in range(F):   # fth filter        
239             for j in range(H_new):            
240                 for k in range(W_new):
241                     # 3*7*7
242                     window = x_padded[i, :, j*s:HH+j*s, k*s:WW+k*s]
243                     db[f] += dout[i, f, j, k]
244                     # 3*7*7
245                     dw[f] += window * dout[i, f, j, k]
246                     # 3*7*7 => 2*3*38*38
247                     dx_padded[i, :, j*s:HH+j*s, k*s:WW+k*s] += w[f] * dout[i, f, j, k]
248 
249     # Unpad
250     dx = dx_padded[:, :, pad:pad+H, pad:pad+W]
251 
252     return dx, dw, db
253 '''
254 功能：减少特征尺寸大小
255 前向最大池化：在特征矩阵中选取指定大小窗口，获取窗口内元素最大值作为输出窗口映射值，
256 先有后下遍历，直至获取整个特征矩阵对应的新映射特征矩阵。
257 输入x：2*32*32*32
258 池化参数：窗口：2*2，步长：2
259 输出窗口宽，高：(32-2)/2 + 1 = 16
260 输出大小：2*32*16*16
261 '''
262 def max_pool_forward_naive(x, pool_param):
263     HH, WW = pool_param['pool_height'], pool_param['pool_width']
264     s = pool_param['stride']
265     N, C, H, W = x.shape
266     H_new = 1 + (H - HH) // s
267     W_new = 1 + (W - WW) // s
268     out = np.zeros((N, C, H_new, W_new))
269     for i in range(N):    
270         for j in range(C):        
271             for k in range(H_new):            
272                 for l in range(W_new):                
273                     window = x[i, j, k*s:HH+k*s, l*s:WW+l*s] 
274                     out[i, j, k, l] = np.max(window)
275 
276     cache = (x, pool_param)
277 
278     return out, cache
279 
280 '''
281 反向传播之池化：增大特征尺寸大小
282 在缓存中取出前向池化时输入特征，选取某一范围矩阵窗口，
283 找出最大值所在的位置，根据这个位置将dout值映射到新的矩阵对应位置上，
284 而新矩阵其他位置都初始化为0.
285 输入dout:2*32*16*16
286 输出dx:2*32*32*32
287 '''
288 def max_pool_backward_naive(dout, cache):
289     x, pool_param = cache
290     HH, WW = pool_param['pool_height'], pool_param['pool_width']
291     s = pool_param['stride']
292     N, C, H, W = x.shape
293     H_new = 1 + (H - HH) // s
294     W_new = 1 + (W - WW) // s
295     dx = np.zeros_like(x)
296     for i in range(N):    
297         for j in range(C):        
298             for k in range(H_new):            
299                 for l in range(W_new):
300                     # 取前向传播时输入的某一池化窗口
301                     window = x[i, j, k*s:HH+k*s, l*s:WW+l*s]
302                     # 计算窗口最大值
303                     m = np.max(window)
304                     # 根据最大值所在位置以及dout对应值=>新矩阵窗口数值
305                     # [false,false
306                     #  true, false]  * 1 => [0,0
307                     #                        1,0]
308                     dx[i, j, k*s:HH+k*s, l*s:WW+l*s] = (window == m) * dout[i, j, k, l]
309 
310     return dx

View Code

optim.py

  1 import numpy as np
  2 
  3 def sgd(w, dw, config=None):    
  4     """    
  5     Performs vanilla stochastic gradient descent.    
  6     config format:    
  7     - learning_rate: Scalar learning rate.    
  8     """    
  9     if config is None: config = {}    
 10     config.setdefault('learning_rate', 1e-2)    
 11     w -= config['learning_rate'] * dw    
 12 
 13     return w, config
 14 '''
 15 SGD:随机梯度下降：由梯度计算新的权重矩阵w
 16 sgd_momentum 是sgd的改进版，解决sgd更新不稳定，陷入局部最优的问题。
 17 增加一个动量因子momentum，可以在一定程度上增加稳定性，
 18 从而学习地更快，并且还有一定摆脱局部最优的能力。
 19 
 20 '''
 21 def sgd_momentum(w, dw, config=None):    
 22     """    
 23     Performs stochastic gradient descent with momentum.    
 24     config format:    
 25     - learning_rate: Scalar learning rate.    
 26     - momentum: Scalar between 0 and 1 giving the momentum value.                
 27     Setting momentum = 0 reduces to sgd.    
 28     - velocity（速度）: A numpy array of the same shape as w and dw used to store a moving
 29     average of the gradients.   
 30     """   
 31     if config is None: config = {}    
 32     config.setdefault('learning_rate', 1e-2)   
 33     config.setdefault('momentum', 0.9)
 34     # config 如果存在属性velocity，则获取config['velocity']，否则获取np.zeros_like(w)
 35     v = config.get('velocity', np.zeros_like(w))    
 36     next_w = None    
 37     v = config['momentum'] * v - config['learning_rate'] * dw    
 38     next_w = w + v    
 39     config['velocity'] = v    
 40 
 41     return next_w, config
 42 
 43 def rmsprop(x, dx, config=None):    
 44     """    
 45     Uses the RMSProp update rule, which uses a moving average of squared gradient    
 46     values to set adaptive per-parameter learning rates.    
 47     config format:    
 48     - learning_rate: Scalar learning rate.    
 49     - decay_rate: Scalar between 0 and 1 giving the decay rate for the squared                  
 50     gradient cache.    
 51     - epsilon: Small scalar used for smoothing to avoid dividing by zero.    
 52     - cache: Moving average of second moments of gradients.   
 53     """    
 54     if config is None: config = {}    
 55     config.setdefault('learning_rate', 1e-2)  
 56     config.setdefault('decay_rate', 0.99)    
 57     config.setdefault('epsilon', 1e-8)    
 58     config.setdefault('cache', np.zeros_like(x))    
 59     next_x = None    
 60     cache = config['cache']    
 61     decay_rate = config['decay_rate']    
 62     learning_rate = config['learning_rate']    
 63     epsilon = config['epsilon']    
 64     cache = decay_rate * cache + (1 - decay_rate) * (dx**2)    
 65     x += - learning_rate * dx / (np.sqrt(cache) + epsilon)  
 66     config['cache'] = cache    
 67     next_x = x    
 68 
 69     return next_x, config
 70 
 71 def adam(x, dx, config=None):    
 72     """    
 73     Uses the Adam update rule, which incorporates moving averages of both the  
 74     gradient and its square and a bias correction term.    
 75     config format:    
 76     - learning_rate: Scalar learning rate.    
 77     - beta1: Decay rate for moving average of first moment of gradient.    
 78     - beta2: Decay rate for moving average of second moment of gradient.   
 79     - epsilon: Small scalar used for smoothing to avoid dividing by zero.    
 80     - m: Moving average of gradient.    
 81     - v: Moving average of squared gradient.    
 82     - t: Iteration number.   
 83     """    
 84     if config is None: config = {}    
 85     config.setdefault('learning_rate', 1e-3)    
 86     config.setdefault('beta1', 0.9)    
 87     config.setdefault('beta2', 0.999)    
 88     config.setdefault('epsilon', 1e-8)    
 89     config.setdefault('m', np.zeros_like(x))    
 90     config.setdefault('v', np.zeros_like(x))    
 91     config.setdefault('t', 0)   
 92     next_x = None    
 93     m = config['m']    
 94     v = config['v']    
 95     beta1 = config['beta1']    
 96     beta2 = config['beta2']    
 97     learning_rate = config['learning_rate']    
 98     epsilon = config['epsilon']   
 99     t = config['t']    
100     t += 1    
101     m = beta1 * m + (1 - beta1) * dx    
102     v = beta2 * v + (1 - beta2) * (dx**2)    
103     m_bias = m / (1 - beta1**t)    
104     v_bias = v / (1 - beta2**t)    
105     x += - learning_rate * m_bias / (np.sqrt(v_bias) + epsilon)    
106     next_x = x    
107     config['m'] = m    
108     config['v'] = v    
109     config['t'] = t    
110 
111     return next_x, config

View Code

solver.py

  1 import numpy as np
  2 try:
  3   from . import optim
  4 except Exception:
  5   import optim
  6 
  7 class Solver(object):
  8   """
  9   A Solver encapsulates all the logic necessary for training classification
 10   models. The Solver performs stochastic gradient descent using different
 11   update rules defined in optim.py.
 12 
 13   The solver accepts both training and validataion data and labels so it can
 14   periodically check classification accuracy on both training and validation
 15   data to watch out for overfitting.
 16 
 17   To train a model, you will first construct a Solver instance, passing the
 18   model, dataset, and various optoins (learning rate, batch size, etc) to the
 19   constructor. You will then call the train() method to run the optimization
 20   procedure and train the model.
 21   
 22   After the train() method returns, model.params will contain the parameters
 23   that performed best on the validation set over the course of training.
 24   In addition, the instance variable solver.loss_history will contain a list
 25   of all losses encountered during training and the instance variables
 26   solver.train_acc_history and solver.val_acc_history will be lists containing
 27   the accuracies of the model on the training and validation set at each epoch.
 28   
 29   Example usage might look something like this:
 30   
 31   data = {
 32     'X_train': # training data
 33     'y_train': # training labels
 34     'X_val': # validation data
 35     'X_train': # validation labels
 36   }
 37   model = MyAwesomeModel(hidden_size=100, reg=10)
 38   solver = Solver(model, data,
 39                   update_rule='sgd',
 40                   optim_config={
 41                     'learning_rate': 1e-3,
 42                   },
 43                   lr_decay=0.95,
 44                   num_epochs=10, batch_size=100,
 45                   print_every=100)
 46   solver.train()
 47 
 48 
 49   A Solver works on a model object that must conform to the following API:
 50 
 51   - model.params must be a dictionary mapping string parameter names to numpy
 52     arrays containing parameter values.
 53 
 54   - model.loss(X, y) must be a function that computes training-time loss and
 55     gradients, and test-time classification scores, with the following inputs
 56     and outputs:
 57 
 58     Inputs:
 59     - X: Array giving a minibatch of input data of shape (N, d_1, ..., d_k)
 60     - y: Array of labels, of shape (N,) giving labels for X where y[i] is the
 61       label for X[i].
 62 
 63     Returns:
 64     If y is None, run a test-time forward pass and return:
 65     - scores: Array of shape (N, C) giving classification scores for X where
 66       scores[i, c] gives the score of class c for X[i].
 67 
 68     If y is not None, run a training time forward and backward pass and return
 69     a tuple of:
 70     - loss: Scalar giving the loss
 71     - grads: Dictionary with the same keys as self.params mapping parameter
 72       names to gradients of the loss with respect to those parameters.
 73   """
 74 
 75   def __init__(self, model, data, **kwargs):
 76     """
 77     Construct a new Solver instance.
 78     
 79     Required arguments:
 80     - model: A model object conforming to the API described above
 81     - data: A dictionary of training and validation data with the following:
 82       'X_train': Array of shape (N_train, d_1, ..., d_k) giving training images
 83       'X_val': Array of shape (N_val, d_1, ..., d_k) giving validation images
 84       'y_train': Array of shape (N_train,) giving labels for training images
 85       'y_val': Array of shape (N_val,) giving labels for validation images
 86       
 87     Optional arguments:
 88     - update_rule: A string giving the name of an update rule in optim.py.
 89       Default is 'sgd'.
 90     - optim_config: A dictionary containing hyperparameters that will be
 91       passed to the chosen update rule. Each update rule requires different
 92       hyperparameters (see optim.py) but all update rules require a
 93       'learning_rate' parameter so that should always be present.
 94     - lr_decay: A scalar for learning rate decay; after each epoch the learning
 95       rate is multiplied by this value.
 96     - batch_size: Size of minibatches used to compute loss and gradient during
 97       training.
 98     - num_epochs: The number of epochs to run for during training.
 99     - print_every: Integer; training losses will be printed every print_every
100       iterations.
101     - verbose: Boolean; if set to false then no output will be printed during
102       training.
103     """
104     self.model = model
105     self.X_train = data['X_train']
106     self.y_train = data['y_train']
107     self.X_val = data['X_val']
108     self.y_val = data['y_val']
109     
110     # Unpack keyword arguments
111     # pop(key, default):删除kwargs对象中key，如果存在该key，返回该key对应的value，否则，返回default值。
112     self.update_rule = kwargs.pop('update_rule', 'sgd')
113     self.optim_config = kwargs.pop('optim_config', {})
114     self.lr_decay = kwargs.pop('lr_decay', 1.0)
115     self.batch_size = kwargs.pop('batch_size', 2)
116     self.num_epochs = kwargs.pop('num_epochs', 10)
117 
118     self.print_every = kwargs.pop('print_every', 10)
119     self.verbose = kwargs.pop('verbose', True)
120 
121     # Throw an error if there are extra keyword arguments
122     # 删除kwargs中参数后，校验是否还有多余参数
123     if len(kwargs) > 0:
124       extra = ', '.join('"%s"' % k for k in kwargs.keys())
125       raise ValueError('Unrecognized arguments %s' % extra)
126 
127     # Make sure the update rule exists, then replace the string
128     # name with the actual function
129     # 检查optim对象中是否有属性或方法名为self.update_rule
130     if not hasattr(optim, self.update_rule):
131       raise ValueError('Invalid update_rule "%s"' % self.update_rule)
132     self.update_rule = getattr(optim, self.update_rule)
133 
134     self._reset()
135 
136 
137   def _reset(self):
138     """
139     Set up some book-keeping variables for optimization. Don't call this
140     manually.
141     """
142     # Set up some variables for book-keeping
143     self.epoch = 0
144     self.best_val_acc = 0
145     self.best_params = {}
146     self.loss_history = []
147     self.train_acc_history = []
148     self.val_acc_history = []
149 
150     # Make a deep copy of the optim_config for each parameter
151     self.optim_configs = {}
152     for p in self.model.params:
153       d = {k: v for k, v in self.optim_config.items()}
154       self.optim_configs[p] = d
155 
156 
157   def _step(self):
158     """
159     Make a single gradient update. This is called by train() and should not
160     be called manually.
161     """
162     # Make a minibatch of training data
163     # 500 张图片
164     num_train = self.X_train.shape[0]
165     # 随机选出batch_size：2 张
166     batch_mask = np.random.choice(num_train, self.batch_size)
167 
168    # batch_mask = [t%(num_train//2), num_train//2 + t%(num_train//2)]
169 
170 
171 
172     # 训练样本矩阵[2,3,32,32]
173     X_batch = self.X_train[batch_mask]
174     # 标签矩阵[2,] 图片类型
175     y_batch = self.y_train[batch_mask]
176 
177     # Compute loss and gradient
178     loss, grads = self.model.loss(X_batch, y_batch)
179     self.loss_history.append(loss)
180 
181     # 更新模型超参(w1,b1),(w2,b2),(w3,b3)，以及保存更新超参时对应参数因子
182     # Perform a parameter update
183     for p, w in self.model.params.items():
184       dw = grads[p]
185       config = self.optim_configs[p]
186       next_w, next_config = self.update_rule(w, dw, config)
187       self.model.params[p] = next_w
188       # 保存参数因子，learning_rate(学习率)，velocity(速度)
189       self.optim_configs[p] = next_config
190 
191 
192   def check_accuracy(self, X, y, num_samples=None, batch_size=2):
193     """
194     Check accuracy of the model on the provided data.
195     
196     Inputs:
197     - X: Array of data, of shape (N, d_1, ..., d_k)
198     - y: Array of labels, of shape (N,)
199     - num_samples: If not None, subsample the data and only test the model
200       on num_samples datapoints.
201     - batch_size: Split X and y into batches of this size to avoid using too
202       much memory.
203       
204     Returns:
205     - acc: Scalar giving the fraction of instances that were correctly
206       classified by the model.
207     """
208     
209     # Maybe subsample the data
210     N = X.shape[0]
211     if num_samples is not None and N > num_samples:
212       # 随机选取num_samples张图片，返回选取图片索引
213       mask = np.random.choice(N, num_samples)
214       N = num_samples
215       X = X[mask]
216       y = y[mask]
217 
218     # Compute predictions in batches
219     num_batches = N // batch_size
220     if N % batch_size != 0:
221       num_batches += 1
222     y_pred = []
223     for i in range(num_batches):
224       start = i * batch_size
225       end = (i + 1) * batch_size
226       scores = self.model.loss(X[start:end])
227       y_pred.append(np.argmax(scores, axis=1))
228     y_pred = np.hstack(y_pred)
229     acc = np.mean(y_pred == y)
230 
231     return acc
232 
233   '''
234    训练模型：核心方法
235    epoch > batch_size > iteration >= 1
236    训练总的次数 = num_epochs * iterations_per_epoch
237   '''
238   def train(self):
239     """
240     Run optimization to train the model.
241     """
242     num_train = self.X_train.shape[0]
243     iterations_per_epoch = max(num_train // self.batch_size, 1)
244     num_iterations = self.num_epochs * iterations_per_epoch
245     # 迭代总的次数
246     for t in range(num_iterations):
247       # 某次iteration训练
248       self._step()
249 
250       # Maybe print training loss
251       # verbose：是否显示详细信息
252       if self.verbose and t % self.print_every == 0:
253         print ('(Iteration %d / %d) loss: %f' % (
254                t + 1, num_iterations, self.loss_history[-1]))
255 
256       # At the end of every epoch, increment the epoch counter and decay the
257       # learning rate.
258       # 每迭代完一次epoch后，更新学习率learning_rate，加快运算效率。
259       epoch_end = (t + 1) % iterations_per_epoch == 0
260       if epoch_end:
261         self.epoch += 1
262         for k in self.optim_configs:
263           self.optim_configs[k]['learning_rate'] *= self.lr_decay
264 
265       # Check train and val accuracy on the first iteration, the last
266       # iteration, and at the end of each epoch.
267       # 在第1次迭代，最后1次迭代，或者运行完一个epoch后，校验训练结果。
268       first_it = (t == 0)
269       last_it = (t == num_iterations + 1)
270       if first_it or last_it or epoch_end:
271         train_acc = self.check_accuracy(self.X_train, self.y_train,
272                                         num_samples=4)
273         val_acc = self.check_accuracy(self.X_val, self.y_val,num_samples=4)
274         self.train_acc_history.append(train_acc)
275         self.val_acc_history.append(val_acc)
276 
277         if self.verbose:
278           print ('(Epoch %d / %d) train acc: %f; val_acc: %f' % (
279                  self.epoch, self.num_epochs, train_acc, val_acc))
280 
281         # Keep track of the best model
282         if val_acc > self.best_val_acc:
283           self.best_val_acc = val_acc
284           self.best_params = {}
285           for k, v in self.model.params.items():
286             self.best_params[k] = v.copy()
287 
288     # At the end of training swap the best params into the model
289     self.model.params = self.best_params

View Code

5.运行结果以及分析

这里选取500张图片作为训练样本，epoch = 5，batch = 2，每次随机选取2张图片，迭代 5 * 500/2 = 1250次，测试样本选取50张。

由运行结果可以看出，损失loss是逐步下降的。

测试结果只有12%左右，原因有以下几点：

1. 模型比较简单，特征提取不能反映真实特征（一次卷积）；

2. 会出现过拟合问题；

3. 原始训练数据分类图片纹理复杂，这些图片可变性大，从而导致分类结果准确度低；

(airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)

后续会通过tensorflow来实现CNN，测试准确率可以达到71.95%。

6. 参考文献

视觉一只白的博客《常用损失函数小结》https://blog.csdn.net/zhangjunp3/article/details/80467350

理想万岁的博客《Softmax函数详解与推导》：http://www.cnblogs.com/zongfa/p/8971213.html

下路派出所的博客《深度学习（九）深度学习最全优化方法总结比较（SGD，Momentum，Nesterov Momentum，Adagrad，Adadelta，RMSprop，Adam）》

http://www.cnblogs.com/callyblog/p/8299074.html

不要让懒惰占据你的大脑，不要让妥协拖垮了你的人生。青春就是一张票，能不能赶上时代的快车，你的步伐就掌握在你的脚下。

你可能感兴趣的:(深度学习之卷积神经网络(CNN)详解与代码实现（一）)

机器学习--DBSCAN聚类算法详解 2201_75491841 机器学习算法聚类人工智能
目录引言1.什么是DBSCAN聚类？2.DBSCAN聚类算法的原理3.DBSCAN算法的核心概念3.1邻域（Neighborhood）3.2核心点（CorePoint）3.3直接密度可达（DirectlyDensity-Reachable）3.4密度可达（Density-Reachable）3.5密度相连（Density-Connected）4.DBSCAN算法的步骤5.DBSCAN算法的优缺点5
小白学AI量化：DeepSeek+Python构建强大的金融数据挖掘与多维分析机器人老余捞鱼 AI顾投高级策略 AI探讨与学习人工智能 python 金融 deepseek
作者：老余捞鱼原创不易，转载请标明出处及原作者。写在前面的话：在机构主导的量化交易时代，普通投资者如何用一杯奶茶的钱（15元/天）打造专业级智能量化产品？本文将为您揭秘一个革命性的解决方案——基于国产大模型DeepSeek和Python构建的智能数据挖掘分析机器人。它不仅适用于通用网页数据抓取，更能深度应用于金融领域，精准捕捉市场信号。本文“干货”很多，请务必耐心读完。一、颠覆认知的性价比革命1.
项目解决方案：博物馆视频监控系统建设解决方案威迪斯特解决方案博物馆视频联网资源汇聚监控软件音视频客户端
目录一.相关项目背景1.1技术革新背景1.2技术提升背景1.3发展趋势背景1.4合规性要求背景二.与客户确认的需求2.1实时远程监控与高清识别2.2权限管理灵活且安全2.3多平台访问，随时随地监控2.4视频汇聚，集中管理三.实现特点3.1智能预警，守护安全3.2全面支持，无忧运营3.3成本控制，效益最大化3.4拓展集成，信息联动四.必实现的建设目标4.1安全可靠4.2贴合需要4.3易于拓展4.4性
软考中级软件设计师考点知识点笔记总结 day06 莫问alicia 软考中级软件设计师笔记数据结构算法
文章目录6、树和二叉树6.1、树的基本概念6.2、二叉树的基本概念6.3、二叉树的遍历6.4、查找二叉树（二叉排序树）BST6.5、构造霍夫曼树+6.6、线索二叉树6.7、平衡二叉树7、图7.1、存储结构-邻接矩阵7.2、存储结构-邻接表7.3、图的遍历7.4、拓扑排序7.5、最小生成树普利姆算法7.6、克鲁斯卡尔算法6、树和二叉树6.1、树的基本概念结点的度：一个结点的度是指该结点拥有的子树数量
Bash 脚本基础 HXQ_晴天 linux bash chrome 开发语言
一、Bash脚本基础什么是Bash脚本：Bash脚本是一种文本文件，其中包含了一系列的命令，这些命令可以被Bashshell执行。它用于自动化重复性的任务，提高工作效率。Bash脚本的基本结构：以#!/bin/bash开头，表示使用Bash解释器来执行脚本。接下来是具体的命令和操作。二、编写和运行第一个Bash脚本创建脚本文件：使用命令nanohello.sh创建名为hello.sh的文件。编写脚
【机器学习】机器学习工程实战-第3章数据收集和准备腊肉芥末果机器学习工程实战机器学习人工智能
上一章：第2章项目开始前文章目录3.1关于数据的问题3.1.1数据是否可获得3.1.2数据是否相当大3.1.3数据是否可用3.1.4数据是否可理解3.1.5数据是否可靠3.2数据的常见问题3.2.1高成本3.2.2质量差3.2.3噪声（noise）3.2.4偏差（bias）3.2.5预测能力低（lowpredictivepower）3.2.6过时的样本3.2.7离群值3.2.8数据泄露/目标泄漏3
ngx_http_conf_port_t 若云止水 http 网络协议网络
定义在src\http\ngx_http_core_module.htypedefstruct{ngx_int_tfamily;in_port_tport;ngx_array_taddrs;/*arrayofngx_http_conf_addr_t*/}ngx_http_conf_port_t;该结构体用于在Nginx配置阶段存储监听端口的配置信息，是listen指令解析后的核心数据结构。它将同一
python processpoolexecutor_Python多进程解决方案multiprocessing ProcessPoolExecutor weixin_39599046 python
大多数编程语言都会有多线程和多进程的概念，至于线程和进程的概念，大家可以百度一下。作为一门胶水语言，Python毫不意外，也可以利用多线程和多进程处理并发问题，但是多线程由于GIL的存在，起作用范围大打折扣，仅限于在IO等场景可以发挥点作用。所以，今天要跟大家分享的是Python多进程方案，更好地利用系统多核，从而提升性能。基础方案一：利用Process新建一个子进程，在子进程执行任务。我们写一个
html5 相册翻转效果,HTML5 css3：3D旋转木马效果相册岑依惜 html5 相册翻转效果
这篇博客的目的是因为上篇HTML5CSS3专题诱人的实例CSS3打造百度贴吧的3D翻牌效果中有个关于CSS3D效果的比较重要的知识点没讲到，就是perspective和tranlateY效果图：嘿嘿，我把大学毕业时的一些照片，做成旋转木马，绕着我大文理旋转，不忘母校的培育之恩~1、perspectiveperspective属性包括两个属性：none和具有单位的长度值。其中perspective属
机器学习实战第一章机器学习基础 LuoY、 Machine Learning 机器学习算法人工智能
第一章机器学习1.1何谓机器学习1.2关键术语1.3机器学习的主要任务1.4如何选择合适的算法1.5开发机器学习应用程序的步骤1.6Python语言的优势1.1何谓机器学习 1、简单地说，机器学习就是把无序的数据转换成有用的信息； 2、机器学习能让我们自数据集中受启发，我们会利用计算机来彰显数据背后的真实含义； 3、机器学习横跨计算机科学、工程技术和统计学等多个学科，需要多学科的
HTML5+CSS实现图片3D旋转效果，附音乐宁醉小白 html5 前端 html
利用程序呈现图片，可以俘获一众女生的心，增加音乐可以实现图片变化的同时也带上了想要得到效果，如此一程序实乃众人之喜。先看看程序呈现的效果，还是特别吸引人的。先在网上爬取想要呈现的美女照片，存放在文件夹img-one，与程序路径一致。图片像素需进行调整，同一面图片可以使用同一个图片，保持图片像素一致的同时也增加了立体感。第二张02.jpg和2.jpg可以倒着放，这样在程序实现的时候，可以和其他方向的
炫酷的HTML5粒子动画特效实现详解木木黄木木 html5 前端 html
炫酷的HTML5粒子动画特效实现详解这里写目录标题炫酷的HTML5粒子动画特效实现详解项目介绍技术栈项目架构1.HTML结构2.样式设计核心实现1.粒子类设计2.动画效果实现星空效果烟花效果雨滴效果3.鼠标交互性能优化效果展示总结项目介绍本文将详细介绍如何使用HTML5Canvas技术实现一个炫酷的粒子动画特效系统。该系统包含三种不同的动画效果：星空、烟花和雨滴，并支持鼠标交互功能，能够为网页增添
SQL自学：怎么创建视图 m0_74823471 面试学习路线阿里巴巴 sql 数据库
在SQL中，视图是一种虚拟表，它是基于一个或多个表的查询结果集。视图并不实际存储数据，而是在每次查询时动态生成结果。一、创建视图的语法（以MySQL为例）CREATEVIEWview_nameASSELECTcolumn1,column2,...FROMtable_nameWHEREcondition;view_name：是要创建的视图的名称。column1,column2,...：要在视图中显示
代码随想录day7-链表俩数相加凌凡天链表数据结构 java 算法 leetcode
给你两个非空的链表，表示两个非负的整数。它们每位数字都是按照逆序的方式存储的，并且每个节点只能存储一位数字。请你将两个数相加，并以相同形式返回一个表示和的链表。你可以假设除了数字0之外，这两个数都不会以0开头。示例1：输入：l1=[2,4,3],l2=[5,6,4]输出：[7,0,8]解释：342+465=807.示例2：输入：l1=[0],l2=[0]输出：[0]示例3：输入：l1=[9,9,9
数据挖掘实战-基于机器学习的垃圾邮件检测模型艾派森数据挖掘实战合集数据挖掘机器学习人工智能 python
‍♂️个人主页：@艾派森的个人主页✍作者简介：Python学习者希望大家多多支持，我们一起进步！如果文章对你有帮助的话，欢迎评论点赞收藏加关注+目录1.项目背景2.数据集介绍
conda篇----在已有conda环境的基础上升级python包心惠天意 conda python jvm
conda篇----在已有conda环境的基础上升级python包原先的python版本第一步：condaupdate--all(py11)[xxx@aivrs01xxx]$condaupdate--allCollectingpackagemetadata(current_repodata.json):doneSolvingenvironment:done==>WARNING:Anewervers
SQL数据更新小王Jacky 数据库学习 sql 数据库
1.插入数据**(1)插入单个元组**--向学生表S插入一条学生记录INSERTINTOS(SNO,SN,SEX,AGE,DEPT)VALUES('S001','张三','男',20,'计算机系');--向选课表SC插入一条选课记录INSERTINTOSC(SNO,CNO,SCORE)VALUES('S001','C001',85);**(2)插入多个元组**--向课程表C插入多条课程记录INSE
PTA:数组区间差集A-B 悦悦子a啊 C语言PTA习题算法数据结构
输入一个整数数组（题目保证数组中的整数不重复），然后输入两个闭区间，区间A[L1,R1]和区间B[L2,R2]，L1,R1代表区间A的左/右边界的数组下标；L2,R2代表区间B的左/右边界的数组下标（下标从0开始，且题目保证下标均在有效范围内不会越界）。输出由这两个区间形成的差集（A-B）的数值。输入格式:一个整数n（n不大于20），然后输入n个整数。最后输入4个整数，分别代表区间A和区间B的左边
环境配置（1）：笔记本window、虚拟机ubuntu、开发板三者互ping通信，并且虚拟机ubuntu和开发板能上网 lishing6 ubuntu linux mcu 嵌入式硬件 arm开发物联网硬件工程
1.配置网络我们配置网络是为了方便后续调试开发板系统或者应用程序时，能够使用tftp协议nfs协议等拷贝文件，以及设置文件系统启动方式为nfs挂载启动。2.设置Ubuntu使用NAT网络NAT是什么意思？NetworkAddressTranslation，网络地址转换。举个例子，在NAT里，Windows就是一个爱护孩子的父亲，Ubuntu就是受保护的小孩。小孩要买东西，都由他父亲代劳，别人根本不
systemd-networkd 的 *.network 配置文件详解笔记250323 kfepiza 网络通讯传输协议物联 #控制台命令行 Shell脚本 sh cmd 等 #Linux CentOS Ubuntu 等笔记 tcp/ip 网络 linux
systemd-networkd的*.network配置文件详解笔记250323查看官方文档可以用mansystemd.network命令,或访问:https://www.freedesktop.org/software/systemd/man/latest/systemd.network.html名称systemd.network—网络配置概要network.network描述一个纯INI风格的
systemd-networkd NetworkManager 介绍追心嵌入式 linux
systemd-networkd和NetworkManager的详细介绍systemd-networkd和NetworkManager都是Linux系统中常用的网络管理工具，但它们的设计目标和使用场景不同。以下是它们的详细介绍、功能、使用场景和差异。1.systemd-networkdsystemd-networkd是一个由systemd提供的网络管理工具，旨在为Linux系统提供网络配置和管理的
租赁APP开发的全攻略和市场潜力分析红点聊租赁其他
内容概要在当今快节奏的生活中，租赁APP似乎成为了我们日常生活的“新宠”。它不仅为个人提供了便利，也为商家开辟了一片广阔的蓝海。要想在这一领域取得成功，首先得做好市场调研。了解用户需求、竞争对手和市场动态是必不可少的。接下来是核心功能设计，我们需要考虑如何让用户更方便地找到、租赁商品，同时优化用户体验，让每一位潜在用户都能心甘情愿地停留与互动。为了更直观地理解市场，我们可以参考以下表格：行业增长率
集成学习（随机森林） herry57 数学建模大数据随机森林集成学习
目录一、集成学习概念二、Bagging集成原理三、随机森林四、例子（商品分类）一、集成学习概念集成学习通过建⽴⼏个模型来解决单⼀预测问题。它的⼯作原理是⽣成多个分类器/模型，各⾃独⽴地学习和作出预测。这些预测最后结合成组合预测，因此优于任何⼀个单分类的做出预测。只要单分类器的表现不太差，集成学习的结果总是要好于单分类器的二、Bagging集成原理分类圆形和长方形三、随机森林在机器学习中，随机森林是
HTML实现酷炫3D相册算法与编程之美编程之美 css html js css3 javascript
欢迎点击「算法与编程之美」↑关注我们！本文首发于微信公众号："算法与编程之美"，欢迎关注，及时了解更多此系列文章。欢迎加入团队圈子！与作者面对面！直接点击！目录1、创建文件目录2、调背景色3、制作3D相册4、将图片散开，围成一圈。5、绘制透明底盘6、最终效果1、创建文件目录在Hbuilder在新建一个目录，创建css和js文件。图12、调背景色在style块里面给整个页面渲染成黑色调。*{padd
新需求如何实现火火PM打怪中考公笔记笔记
作为产品经理，面对新需求时，我会结合产品管理和项目管理的双重逻辑，采用以下结构化流程，确保需求既能满足用户价值，又能高效落地：一、需求澄清与价值验证（NPDP核心逻辑）需求背景挖掘与需求提出方（用户/业务/领导）深度沟通，明确：痛点场景：需求解决的具体问题（例如“政务数据共享接口调用失败率高”）。期望目标：量化成功标准（如“接口成功率从70%提升至95%”）。工具：5W1H分析法、用户故事地图（U
【Linux 下的 bash 无法正常解析, Windows 的 CRLF 换行符问题导致的】待磨的钝刨 linux bash windows
文章目录报错原因：解决办法：方法一：用`dos2unix`修复方法二：手动转换换行符方法三：VSCode或其他编辑器手动改总结这个错误很常见，原因是你的wait_for_gpu.sh脚本文件格式不对，具体来说是Windows的CRLF换行符问题导致的，Linux下的bash无法正常解析。hadoop@hadoop:~/anaconda3$bashwait_for_gpu.sh:invalidopt
CSS动画：逐帧动画与steps()函数双囍菜菜前端随记 css 前端
逐帧动画与steps()函数：精准掌控动画节奏关键词：steps()函数、雪碧图、精灵动画、帧动画优化文章目录逐帧动画与steps()函数：精准掌控动画节奏一、逐帧动画的本质：时间函数的维度突破1.1线性动画的局限性1.2steps()函数数学解析二、视觉化解析：steps()工作原理2.1时间轴切片演示2.2与线性动画对比三、商业级案例：RPG游戏角色行走动画3.1雪碧图制作规范3.2完整实现代
conda：一个当下最流行的Python虚拟环境工具 Wang_AI
点击上方“AI派”，选择“设为星标”最新分享，第一时间送达！作者：LeonWang，现为中科院特别研究助理(博士后)，在AI、数据科学和科学计算等方面相关的工程实践上积累了丰富的经验。编辑：王老湿前面的文章中，为大家介绍过Python下的虚拟环境和包管理。在实际中，更为流行的是用Conda来管理Python环境。今天这篇文章就为大家介绍这方面的相关内容。Conda环境Conda简介Conda是目前
【Html+CSS】3D旋转相册小木荣 web前端 css html 3d
3D旋转木马相册&3D盒子相册因为代码大部分相同，就放一起了注释一下就是另一个相册3D旋转木马相册body{background-color:#000;/*视距，使子元素获得视距效果*/perspective:900px;}section{margin:20vhauto;position:relative;width:200px;height:200px;/*开启3D空间*/transform-s
解决loguru在multiprocessing并发场景下子线程日志无法记录到文件问题 dkgee python 开发语言
logger对象在创建时必须从父进程传递给子进程才能记录日志，有两种传递方式：第一种：通过类定义一个日志变量，然后外部传入logger对象来记录日志。#workers_a.pyclassWorker:_logger=None@staticmethoddefset_logger(logger_):Worker._logger=logger_defwork(self,x):self._logger.i
Dom 周华华 JavaScript html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml&q
【Spark九十六】RDD API之combineByKey bit1129 spark
1. combineByKey函数的运行机制 RDD提供了很多针对元素类型为(K,V)的API，这些API封装在PairRDDFunctions类中，通过Scala隐式转换使用。这些API实现上是借助于combineByKey实现的。combineByKey函数本身也是RDD开放给Spark开发人员使用的API之一首先看一下combineByKey的方法说明：
msyql设置密码报错：ERROR 1372 (HY000): 解决方法详解 daizj mysql 设置密码
MySql给用户设置权限同时指定访问密码时，会提示如下错误： ERROR 1372 (HY000): Password hash should be a 41-digit hexadecimal number；问题原因：你输入的密码是明文。不允许这么输入。解决办法：用select password('你想输入的密码');查询出你的密码对应的字符串，然后
路漫漫其修远兮吾将上下而求索周凡杨学习思索
王国维在他的《人间词话》中曾经概括了为学的三种境界古今之成大事业、大学问者，罔不经过三种之境界。“昨夜西风凋碧树。独上高楼，望尽天涯路。”此第一境界也。“衣带渐宽终不悔，为伊消得人憔悴。”此第二境界也。“众里寻他千百度，蓦然回首，那人却在灯火阑珊处。”此第三境界也。学习技术，这也是你必须经历的三种境界。第一层境界是说，学习的路是漫漫的，你必须做好充分的思想准备，如果半途而废还不如不要开始。这里，注
Hadoop(二)对话单的操作朱辉辉33 hadoop
Debug： 1、 A = LOAD '/user/hue/task.txt' USING PigStorage(' ') AS (col1,col2,col3); DUMP A; //输出结果前几行示例： (>ggsnPDPRecord(21),,) (-->recordType(0),,) (-->networkInitiation(1),,)
web报表工具FineReport常用函数的用法总结（日期和时间函数）老A不折腾 finereport 报表工具 web开发
web报表工具FineReport常用函数的用法总结（日期和时间函数）说明：凡函数中以日期作为参数因子的，其中日期的形式都必须是yy/mm/dd。而且必须用英文环境下双引号(" ")引用。 DATE DATE(year,month,day):返回一个表示某一特定日期的系列数。 Year:代表年，可为一到四位数。 Month:代表月份。
c++ 宏定义中的##操作符墙头上一根草 C++
#与##在宏定义中的--宏展开 #include <stdio.h> #define f(a,b) a##b #define g(a) #a #define h(a) g(a) int main() { &nbs
分析Spring源代码之，DI的实现 aijuans spring DI 现源代码
(转) 分析Spring源代码之，DI的实现 2012/1/3 by tony 接着上次的讲，以下这个sample [java] view plain copy print
for循环的进化 alxw4616 JavaScript
// for循环的进化 // 菜鸟 for (var i = 0; i < Things.length ; i++) { // Things[i] } // 老鸟 for (var i = 0, len = Things.length; i < len; i++) { // Things[i] } // 大师 for (var i = Things.le
网络编程Socket和ServerSocket简单的使用百合不是茶网络编程基础 IP地址端口
网络编程;TCP/IP协议网络:实现计算机之间的信息共享,数据资源的交换协议:数据交换需要遵守的一种协议,按照约定的数据格式等写出去端口:用于计算机之间的通信每运行一个程序，系统会分配一个编号给该程序，作为和外界交换数据的唯一标识 0~65535 查看被使用的
JDK1.5 生产消费者 bijian1013 java thread 生产消费者 java多线程
ArrayBlockingQueue：一个由数组支持的有界阻塞队列。此队列按 FIFO（先进先出）原则对元素进行排序。队列的头部是在队列中存在时间最长的元素。队列的尾部是在队列中存在时间最短的元素。新元素插入到队列的尾部，队列检索操作则是从队列头部开始获得元素。 ArrayBlockingQueue的常用方法：
JAVA版身份证获取性别、出生日期及年龄 bijian1013 java 性别出生日期年龄
工作中需要根据身份证获取性别、出生日期及年龄，且要还要支持15位长度的身份证号码，网上搜索了一下，经过测试好像多少存在点问题，干脆自已写一个。 CertificateNo.java package com.bijian.study; import java.util.Calendar; import
【Java范型六】范型与枚举 bit1129 java
首先，枚举类型的定义不能带有类型参数，所以，不能把枚举类型定义为范型枚举类，例如下面的枚举类定义是有编译错的 public enum EnumGenerics<T> { //编译错，提示枚举不能带有范型参数 OK, ERROR; public <T> T get(T type) { return null;
【Nginx五】Nginx常用日志格式含义 bit1129 nginx
1. log_format 1.1 log_format指令用于指定日志的格式，格式： log_format name(格式名称) type(格式样式) 1.2 如下是一个常用的Nginx日志格式： log_format main '[$time_local]|$request_time|$status|$body_bytes
Lua 语言 15 分钟快速入门 ronin47 lua 基础
- - 单行注释 - - [[ [多行注释] - - ]] - - - - - - - - - - - 1. 变量 & 控制流 - - - - - - - - - - num = 23 - - 数字都是双精度 str = 'aspythonstring'
java-35.求一个矩阵中最大的二维矩阵 ( 元素和最大 ) bylijinnan java
the idea is from: http://blog.csdn.net/zhanxinhang/article/details/6731134 public class MaxSubMatrix { /**see http://blog.csdn.net/zhanxinhang/article/details/6731134 * Q35 求一个矩阵中最大的二维
mongoDB文档型数据库特点开窍的石头 mongoDB文档型数据库特点
MongoDD: 文档型数据库存储的是Bson文档-->json的二进制特点：内部是执行引擎是js解释器，把文档转成Bson结构，在查询时转换成js对象。 mongoDB传统型数据库对比传统类型数据库：结构化数据，定好了表结构后每一个内容符合表结构的。也就是说每一行每一列的数据都是一样的文档型数据库：不用定好数据结构，
[毕业季节]欢迎广大毕业生加入JAVA程序员的行列 comsci java
一年一度的毕业季来临了。。。。。。。。正在投简历的学弟学妹们。。。如果觉得学校推荐的单位和公司不适合自己的兴趣和专业，可以考虑来我们软件行业，做一名职业程序员。。。软件行业的开发工具中，对初学者最友好的就是JAVA语言了，网络上不仅仅有大量的
PHP操作Excel – PHPExcel 基本用法详解 cuiyadll PHP Excel
导出excel属性设置//Include classrequire_once('Classes/PHPExcel.php');require_once('Classes/PHPExcel/Writer/Excel2007.php');$objPHPExcel = new PHPExcel();//Set properties 设置文件属性$objPHPExcel->getProperties
IBM Webshpere MQ Client User Issue (MCAUSER) darrenzhu IBM jms user MQ MCAUSER
IBM MQ JMS Client去连接远端MQ Server的时候，需要提供User和Password吗？答案是根据情况而定，取决于所定义的Channel里面的属性Message channel agent user identifier (MCAUSER)的设置。 http://stackoverflow.com/questions/20209429/how-mca-user-i
网线的接法 dcj3sjt126com
一、PC连HUB (直连线)A端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。 B端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。二、PC连PC （交叉线）A端：(568A)：白绿，绿，白橙，蓝，白蓝，橙，白棕，棕； B端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。三、HUB连HUB&nb
Vimium插件让键盘党像操作Vim一样操作Chrome dcj3sjt126com chrome vim
什么是键盘党？键盘党是指尽可能将所有电脑操作用键盘来完成，而不去动鼠标的人。鼠标应该说是新手们的最爱，很直观，指哪点哪，很听话！不过常常使用电脑的人，如果一直使用鼠标的话，手会发酸，因为操作鼠标的时候，手臂不是在一个自然的状态，臂肌会处于绷紧状态。而使用键盘则双手是放松状态，只有手指在动。而且尽量少的从鼠标移动到键盘来回操作，也省不少事。在chrome里安装 vimium 插件
MongoDB查询（2）——数组查询[六] eksliang mongodb MongoDB查询数组
MongoDB查询数组转载请出自出处：http://eksliang.iteye.com/blog/2177292 一、概述 MongoDB查询数组与查询标量值是一样的，例如，有一个水果列表，如下所示： > db.food.find() { "_id" : "001", "fruits" : [ "苹
cordova读写文件（1） gundumw100 JavaScript Cordova
使用cordova可以很方便的在手机sdcard中读写文件。首先需要安装cordova插件：file 命令为： cordova plugin add org.apache.cordova.file 然后就可以读写文件了，这里我先是写入一个文件，具体的JS代码为： var datas=null;//datas need write var directory=&
HTML5 FormData 进行文件jquery ajax 上传到又拍云 ileson jquery Ajax html5 FormData
html5 新东西：FormData 可以提交二进制数据。页面test.html <!DOCTYPE> <html> <head> <title> formdata file jquery ajax upload</title> </head> <body> <
swift appearanceWhenContainedIn:(version1.2 xcode6.4) 啸笑天 version
swift1.2中没有oc中对应的方法： + (instancetype)appearanceWhenContainedIn:(Class <UIAppearanceContainer>)ContainerClass, ... NS_REQUIRES_NIL_TERMINATION; 解决方法：在swift项目中新建oc类如下： #import &
java实现SMTP邮件服务器 macroli java 编程
电子邮件传递可以由多种协议来实现。目前，在Internet 网上最流行的三种电子邮件协议是SMTP、POP3 和 IMAP，下面分别简单介绍。　　◆ SMTP 协议　　简单邮件传输协议(Simple Mail Transfer Protocol,SMTP)是一个运行在TCP/IP之上的协议，用它发送和接收电子邮件。SMTP 服务器在默认端口25上监听。SMTP客户使用一组简单的、基于文本的
mongodb group by having where 查询sql qiaolevip 每天进步一点点学习永无止境 mongo 纵观千象
SELECT cust_id, SUM(price) as total FROM orders WHERE status = 'A' GROUP BY cust_id HAVING total > 250 db.orders.aggregate( [ { $match: { status: 'A' } }, { $group: {
Struts2 Pojo（六） Luob. POJO strust2
注意：附件中有完整案例 1.采用POJO对象的方法进行赋值和传值 2.web配置 <?xml version="1.0" encoding="UTF-8"?> <web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee&q
struts2步骤 wuai struts
1、添加jar包 2、在web.xml中配置过滤器 <filter> <filter-name>struts2</filter-name> <filter-class>org.apache.st

深度学习之卷积神经网络(CNN)详解与代码实现（一）

卷积神经网络(CNN)详解与代码实现

目录

1.应用场景

2.卷积神经网络结构

3.代码实现流程图以及介绍

4.代码实现（python3.6）

5.运行结果以及分析

6.参考文献

1.应用场景

2.卷积神经网络结构

2.1 卷积（convolution）

2.2 Relu激活函数

2.3 池化

2.4 全连接（full connection）

2.5 损失函数（softmax_loss）

2.6 前向传播（forward propagation）

2.7 反向传播（back propagation）

2.8 随机梯度下降（sgd_momentum）

3.代码实现流程图以及介绍

4.代码实现

5.运行结果以及分析

6. 参考文献

你可能感兴趣的:(深度学习之卷积神经网络(CNN)详解与代码实现（一）)