撸码的xiao摩羯

第05章深度卷积神经网络模型

序言

1. 内容介绍

本章介绍深度学习算法-卷积神经网络用于 图片分类 的应用，主要介绍主流深度卷积神经网络 (CNN) 模型，包括 ResNet DenseNet SeNet 的算法模型、数学推理、模型实现 以及 PyTorch框架 的实现。并能够把它应用于现实世界的 数据集 实现分类效果。

2. 理论目标

ResNet 的基础模型架构、训练细节与数学推理
DenseNet 的基础模型架构、训练细节与数学推理
SeNet 的基础模型架构、训练细节与数学推理

3. 实践目标

掌握PyTorch框架下ResNet DenseNet SeNet的实现
掌握迁移学习与特征提取
熟悉各经典算法在图像分类应用上的优缺点

4. 实践数据集

Flower 数据集分类
Oxford-IIIT 数据集分类
CIFAR-10 数据集分类

5. 内容目录

1.卷积神经网络模型详解 ResNet
2.卷积神经网络模型详解 DenseNet
3.PyTorch 实践
4.图像分类实例 Flower 数据集
5.图像分类实例 Oxford-IIIT 数据集
6.图像分类实例 CIFAR-10 数据集

第1节卷积神经网络模型 ResNet

1.1 ResNet 简介

ResNet 残差神经网络是由微软研究院的何恺明、张祥雨、任少卿、孙剑等人提出，以一种残差学习框架来解决 网络退化 问题，从而训练更深的网络。这种框架可以结合已有的各种网络结构，充分发挥二者的优势。在 ImageNet 分类数据集中，拥有152层的残差网络，以 3.75% top-5 的错误率获得了ILSVRC 2015 分类比赛的冠军。同时很多证据表明残差学习是通用的，不仅可以应用于视觉问题，也可应用于非视觉问题。

ResNet 的主要贡献是发现了退化现象(Degradation)，并针对退化现象发明了快捷连接 (Shortcut connection)，极大的消除了深度过大的神经网络训练困难问题。神经网络的深度首次突破了100层、最大的神经网络甚至超过了1000层。ResNet以三种方式挑战了传统的神经网络架构：

ResNet 通过引入 快捷连接 来绕过残差层，这允许数据直接流向任何后续层。这与传统的、顺序的 pipeline 形成鲜明对比：传统的架构中，网络依次处理低级feature 到高级feature 。
ResNet 的层数非常深，最高达1202层。而 ALexNet 这样的架构，网络层数要小两个量级
ResNet 中去掉单个层并不会影响其预测性能。而训练好的 AlexNet 等网络中，移除层会导致预测性能损失

1.2 网络退化问题

学习更深的网络的一个障碍是 梯度消失 与 梯度爆炸，该问题可以通过Batch Normalization 在很大程度上解决。但 ResNet 作者 何凯明 发现随着网络的深度的增加，准确率达到饱和之后迅速下降，而这种下降不是由过拟合引起的，这称作网络退化问题。如果更深的网络训练误差更大，则说明是由于优化算法引起的，即越深的网络，求解优化问题越难。如下所示更深的网络导致更高的训练误差和测试误差

这个现象与 “越深的网络准确率越高的信念” 显然是矛盾的、冲突的。理论上讲，较深的模型不应该比和它对应的、较浅的模型更差。因为较深的模型是较浅的模型的超空间。较深的模型可以通过先构建较浅的模型，然后添加很多恒等映射的网络层。但实际上我们的较深的模型后面添加的不是恒等映射，而是一些 非线性层。

ResNet 团队把退化现象归因为深层神经网络难以实现 恒等变换 (y=x), 同时表明通过多个非线性层来近似横等映射可能是困难的。于是，ResNet 团队在残差网络模块中增加了快捷连接分支，在 线性转换和非线性转换 之间寻求一个平衡，以解决网络退化问题。

1.3 ResNet 模型结构

1.3.1 ResNet 块

假设需要学习的是映射 \vec{y} = H(\vec{x})y=H(x) ，残差块使用堆叠的非线性层拟合残差：\vec{y} = F(\vec{x}, W_i) + \vec{x}y=F(x,Wi)+x , 其中

\vec{x}x 和 \vec{y}y 是块的输入和输出向量
F(\vec{x}, W_i)F(x,Wi) 是要学习的残差映射。因为 F(\vec{x}, W_i) = H(\vec{x}) - \vec{x}F(x,Wi)=H(x)−x ，因此称 FF 为残差
通过 快捷连接 (shortcut) 逐个元素相加来执行。快捷连接指的是那些跳过一层或者更多层的连接
- 快捷连接简单的执行恒等映射，并将其输出添加到堆叠层的输出
- 快捷连接既不增加额外的参数，也不增加计算复杂度
- 相加之后通过非线性激活函数，这可以视作对整个残差块添加非线性，即 relu(\vec{y})relu(y)

残差块隐含了一个假设：F(\vec{x}, W_i)F(x,Wi) 和 \vec{x}x 的维度相等。如果它们的维度不等，则需要在快捷连接中对 \vec{x}x 执行线性投影来匹配维度：\vec{y} = F(\vec{x}, W_i) + W_s\vec{x}y=F(x,Wi)+Wsx , 事实上当它们维度相等时，也可以执行线性变换。但是实践表明使用 恒等映射足以解决退化问题 ，而使用线性投影会增加参数和计算复杂度。因此 W_sWs 仅在匹配维度时使用

残差函数 FF 的形式是可变的

层数可变：论文中的实验包含有两层堆叠、三层堆叠，实际任务中也可以包含更多层的堆叠。如果 FF 只有一层，则残差块退化线性层为 \vec{y} = W\vec{x} + \vec{x}y=Wx+x
连接形式可变：不仅可用于全连接层，也可用于卷积层。此时 FF 代表多个卷积层的堆叠，而最终的逐元素加法 ++ 在两个 feature map 上逐通道进行

残差学习成功的原因：学习残差 F(\vec{x}, W_i)F(x,Wi) 比学习原始映射 H(\vec{x})H(x) 要更容易

当原始映射 HH 就是一个恒等映射时，FF 就是一个零映射。此时求解器只需要简单的将堆叠的非线性连接的权重推向零即可。实际任务中原始映射 HH 可能不是一个恒等映射：
- 如果原始映射 HH 更偏向于恒等映射（而不是更偏向于非恒等映射），则 FF 就是关于恒等映射的抖动，会更容易学习
- 如果原始映射 HH 更偏向于零映射，那么学习 HH 本身要更容易。但是在实际应用中，零映射非常少见，因为它会导致输出全为0
如果原始映射 HH 是一个非恒等映射，则可以考虑对残差模块使用缩放因子。如 Inception-Resnet 中在残差模块与快捷连接叠加之前，对残差进行缩放
可以通过观察残差 FF 的输出来判断：如果 FF 的输出均为 0 附近的、较小的数，则说明原始映射 HH 更偏向于恒等映射；否则，说明原始映射 HH 更偏向于非横等映射

1.3.2 ResNet 模型网络

Plain 网络为一些简单网络结构的叠加，如下图中给出了四种 Plain 网络，它们的区别主要是网络深度不同。其中，输入图片尺寸均为 224\times224224×224

ResNet 简单的在 Plain 网络上添加快捷连接来实现

FLOPs：floating point operations 的缩写，意思是浮点运算量，用于衡量算法/模型的复杂度

FLOPS：floating point per second的缩写，意思是每秒浮点运算次数，用于衡量计算速度

相对于输入的 feature map，残差块的输出 feature map 尺寸可能会发生变化

输出 feature map 的通道数增加, 此时需要扩充快捷连接的输出 feature map, 否则快捷连接的输出 feature map 无法和残差块的 feature map 累加
- 直接通过 0 来填充需要扩充的维度，在图中以实线标识
- 通过1x1 卷积来扩充维度，在图中以虚线标识
输出 feature map 的尺寸减半, 此时需要对快捷连接执行步长为 2 的 池化/卷积 ：如果快捷连接已经采用 1\times11×1 卷积，则该卷积步长为 2 ；否则采用步长为 2 的最大池化

1.4 ResNet PyTorch

# %load res.py import math import torch import torch.nn as nn class Bottleneck(nn.Module): def __init__(self,inplanes,planes,downsample = None,stride = 1): super(Bottleneck, self).__init__() self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=1, bias=False) self.bn1 = nn.BatchNorm2d(planes) self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False) self.bn2 = nn.BatchNorm2d(planes) self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, stride=1, bias=False) self.bn3 = nn.BatchNorm2d(planes * 4) self.relu = nn.ReLU(inplace=True) self.downsample = downsample def forward(self, x): identity = x out = self.conv1(x) out = self.bn1(out) out = self.relu(out) out = self.conv2(out) out = self.bn2(out) out = self.relu(out) out = self.conv3(out) out = self.bn3(out) if self.downsample is not None: identity = self.downsample(x) out += identity out = self.relu(out) return out class ResNet50(nn.Module): def __init__(self, num_classes=1000): super(ResNet50, self).__init__() self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2,padding=3,bias=False) self.bn1 = nn.BatchNorm2d(64) self.relu = nn.ReLU(inplace=True) self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) self.layer1 = self._make_stage(64,64,1,3) self.layer2 = self._make_stage(256,128,2,4) self.layer3 = self._make_stage(512,256,2,6) self.layer4 = self._make_stage(1024,512,2,3) self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) self.fc = nn.Linear(512 * 4, num_classes) def _make_stage(self, inplanes, planes, stride, num_blocks): downsample = nn.Sequential( nn.Conv2d(inplanes, planes * 4, kernel_size=1, stride=stride, bias=False), nn.BatchNorm2d(planes*4) ) layers = [] layers.append(Bottleneck(inplanes, planes, downsample, stride)) for _ in range(1, num_blocks): layers.append(Bottleneck(4*planes, planes)) return nn.Sequential(*layers) def forward(self, x): x = self.conv1(x) x = self.bn1(x) x = self.relu(x) x = self.maxpool(x) x = self.layer1(x) x = self.layer2(x) x = self.layer3(x) x = self.layer4(x) x = self.avgpool(x) x = torch.flatten(x, 1) x = self.fc(x) return x def build_res50(phase,num_classes,pretrained): if phase != "test" and phase != "train": print("ERROR: Phase: " + phase + " not recognized") return if not pretrained: model = ResNet50(num_classes=num_classes) else: model = ResNet50() model_weights_path = 'weights/resnet50-19c8e357.pth' model.load_state_dict(torch.load(model_weights_path), strict=False) for parma in model.parameters(): parma.requires_grad = False ratio = int(math.sqrt(2048/num_classes)) floor = math.floor(math.log2(ratio)) hidden_size = int(math.pow(2,11-floor)) model.fc = nn.Sequential( nn.Linear(2048, hidden_size), nn.ReLU(inplace=True), nn.Dropout(0.5), nn.Linear(hidden_size,num_classes)) return model

from torchsummary import summary net = build_res50('train',10,False) net.cuda() summary(net,(3,224,224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
            Conv2d-5           [-1, 64, 56, 56]           4,096
       BatchNorm2d-6           [-1, 64, 56, 56]             128
              ReLU-7           [-1, 64, 56, 56]               0
            Conv2d-8           [-1, 64, 56, 56]          36,864
       BatchNorm2d-9           [-1, 64, 56, 56]             128
             ReLU-10           [-1, 64, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]          16,384
      BatchNorm2d-12          [-1, 256, 56, 56]             512
           Conv2d-13          [-1, 256, 56, 56]          16,384
      BatchNorm2d-14          [-1, 256, 56, 56]             512
             ReLU-15          [-1, 256, 56, 56]               0
       Bottleneck-16          [-1, 256, 56, 56]               0
           Conv2d-17           [-1, 64, 56, 56]          16,384
      BatchNorm2d-18           [-1, 64, 56, 56]             128
             ReLU-19           [-1, 64, 56, 56]               0
           Conv2d-20           [-1, 64, 56, 56]          36,864
      BatchNorm2d-21           [-1, 64, 56, 56]             128
             ReLU-22           [-1, 64, 56, 56]               0
           Conv2d-23          [-1, 256, 56, 56]          16,384
      BatchNorm2d-24          [-1, 256, 56, 56]             512
             ReLU-25          [-1, 256, 56, 56]               0
       Bottleneck-26          [-1, 256, 56, 56]               0
           Conv2d-27           [-1, 64, 56, 56]          16,384
      BatchNorm2d-28           [-1, 64, 56, 56]             128
             ReLU-29           [-1, 64, 56, 56]               0
           Conv2d-30           [-1, 64, 56, 56]          36,864
      BatchNorm2d-31           [-1, 64, 56, 56]             128
             ReLU-32           [-1, 64, 56, 56]               0
           Conv2d-33          [-1, 256, 56, 56]          16,384
      BatchNorm2d-34          [-1, 256, 56, 56]             512
             ReLU-35          [-1, 256, 56, 56]               0
       Bottleneck-36          [-1, 256, 56, 56]               0
           Conv2d-37          [-1, 128, 56, 56]          32,768
      BatchNorm2d-38          [-1, 128, 56, 56]             256
             ReLU-39          [-1, 128, 56, 56]               0
           Conv2d-40          [-1, 128, 28, 28]         147,456
      BatchNorm2d-41          [-1, 128, 28, 28]             256
             ReLU-42          [-1, 128, 28, 28]               0
           Conv2d-43          [-1, 512, 28, 28]          65,536
      BatchNorm2d-44          [-1, 512, 28, 28]           1,024
           Conv2d-45          [-1, 512, 28, 28]         131,072
      BatchNorm2d-46          [-1, 512, 28, 28]           1,024
             ReLU-47          [-1, 512, 28, 28]               0
       Bottleneck-48          [-1, 512, 28, 28]               0
           Conv2d-49          [-1, 128, 28, 28]          65,536
      BatchNorm2d-50          [-1, 128, 28, 28]             256
             ReLU-51          [-1, 128, 28, 28]               0
           Conv2d-52          [-1, 128, 28, 28]         147,456
      BatchNorm2d-53          [-1, 128, 28, 28]             256
             ReLU-54          [-1, 128, 28, 28]               0
           Conv2d-55          [-1, 512, 28, 28]          65,536
      BatchNorm2d-56          [-1, 512, 28, 28]           1,024
             ReLU-57          [-1, 512, 28, 28]               0
       Bottleneck-58          [-1, 512, 28, 28]               0
           Conv2d-59          [-1, 128, 28, 28]          65,536
      BatchNorm2d-60          [-1, 128, 28, 28]             256
             ReLU-61          [-1, 128, 28, 28]               0
           Conv2d-62          [-1, 128, 28, 28]         147,456
      BatchNorm2d-63          [-1, 128, 28, 28]             256
             ReLU-64          [-1, 128, 28, 28]               0
           Conv2d-65          [-1, 512, 28, 28]          65,536
      BatchNorm2d-66          [-1, 512, 28, 28]           1,024
             ReLU-67          [-1, 512, 28, 28]               0
       Bottleneck-68          [-1, 512, 28, 28]               0
           Conv2d-69          [-1, 128, 28, 28]          65,536
      BatchNorm2d-70          [-1, 128, 28, 28]             256
             ReLU-71          [-1, 128, 28, 28]               0
           Conv2d-72          [-1, 128, 28, 28]         147,456
      BatchNorm2d-73          [-1, 128, 28, 28]             256
             ReLU-74          [-1, 128, 28, 28]               0
           Conv2d-75          [-1, 512, 28, 28]          65,536
      BatchNorm2d-76          [-1, 512, 28, 28]           1,024
             ReLU-77          [-1, 512, 28, 28]               0
       Bottleneck-78          [-1, 512, 28, 28]               0
           Conv2d-79          [-1, 256, 28, 28]         131,072
      BatchNorm2d-80          [-1, 256, 28, 28]             512
             ReLU-81          [-1, 256, 28, 28]               0
           Conv2d-82          [-1, 256, 14, 14]         589,824
      BatchNorm2d-83          [-1, 256, 14, 14]             512
             ReLU-84          [-1, 256, 14, 14]               0
           Conv2d-85         [-1, 1024, 14, 14]         262,144
      BatchNorm2d-86         [-1, 1024, 14, 14]           2,048
           Conv2d-87         [-1, 1024, 14, 14]         524,288
      BatchNorm2d-88         [-1, 1024, 14, 14]           2,048
             ReLU-89         [-1, 1024, 14, 14]               0
       Bottleneck-90         [-1, 1024, 14, 14]               0
           Conv2d-91          [-1, 256, 14, 14]         262,144
      BatchNorm2d-92          [-1, 256, 14, 14]             512
             ReLU-93          [-1, 256, 14, 14]               0
           Conv2d-94          [-1, 256, 14, 14]         589,824
      BatchNorm2d-95          [-1, 256, 14, 14]             512
             ReLU-96          [-1, 256, 14, 14]               0
           Conv2d-97         [-1, 1024, 14, 14]         262,144
      BatchNorm2d-98         [-1, 1024, 14, 14]           2,048
             ReLU-99         [-1, 1024, 14, 14]               0
      Bottleneck-100         [-1, 1024, 14, 14]               0
          Conv2d-101          [-1, 256, 14, 14]         262,144
     BatchNorm2d-102          [-1, 256, 14, 14]             512
            ReLU-103          [-1, 256, 14, 14]               0
          Conv2d-104          [-1, 256, 14, 14]         589,824
     BatchNorm2d-105          [-1, 256, 14, 14]             512
            ReLU-106          [-1, 256, 14, 14]               0
          Conv2d-107         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-108         [-1, 1024, 14, 14]           2,048
            ReLU-109         [-1, 1024, 14, 14]               0
      Bottleneck-110         [-1, 1024, 14, 14]               0
          Conv2d-111          [-1, 256, 14, 14]         262,144
     BatchNorm2d-112          [-1, 256, 14, 14]             512
            ReLU-113          [-1, 256, 14, 14]               0
          Conv2d-114          [-1, 256, 14, 14]         589,824
     BatchNorm2d-115          [-1, 256, 14, 14]             512
            ReLU-116          [-1, 256, 14, 14]               0
          Conv2d-117         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-118         [-1, 1024, 14, 14]           2,048
            ReLU-119         [-1, 1024, 14, 14]               0
      Bottleneck-120         [-1, 1024, 14, 14]               0
          Conv2d-121          [-1, 256, 14, 14]         262,144
     BatchNorm2d-122          [-1, 256, 14, 14]             512
            ReLU-123          [-1, 256, 14, 14]               0
          Conv2d-124          [-1, 256, 14, 14]         589,824
     BatchNorm2d-125          [-1, 256, 14, 14]             512
            ReLU-126          [-1, 256, 14, 14]               0
          Conv2d-127         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-128         [-1, 1024, 14, 14]           2,048
            ReLU-129         [-1, 1024, 14, 14]               0
      Bottleneck-130         [-1, 1024, 14, 14]               0
          Conv2d-131          [-1, 256, 14, 14]         262,144
     BatchNorm2d-132          [-1, 256, 14, 14]             512
            ReLU-133          [-1, 256, 14, 14]               0
          Conv2d-134          [-1, 256, 14, 14]         589,824
     BatchNorm2d-135          [-1, 256, 14, 14]             512
            ReLU-136          [-1, 256, 14, 14]               0
          Conv2d-137         [-1, 1024, 14, 14]         262,144
     BatchNorm2d-138         [-1, 1024, 14, 14]           2,048
            ReLU-139         [-1, 1024, 14, 14]               0
      Bottleneck-140         [-1, 1024, 14, 14]               0
          Conv2d-141          [-1, 512, 14, 14]         524,288
     BatchNorm2d-142          [-1, 512, 14, 14]           1,024
            ReLU-143          [-1, 512, 14, 14]               0
          Conv2d-144            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-145            [-1, 512, 7, 7]           1,024
            ReLU-146            [-1, 512, 7, 7]               0
          Conv2d-147           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-148           [-1, 2048, 7, 7]           4,096
          Conv2d-149           [-1, 2048, 7, 7]       2,097,152
     BatchNorm2d-150           [-1, 2048, 7, 7]           4,096
            ReLU-151           [-1, 2048, 7, 7]               0
      Bottleneck-152           [-1, 2048, 7, 7]               0
          Conv2d-153            [-1, 512, 7, 7]       1,048,576
     BatchNorm2d-154            [-1, 512, 7, 7]           1,024
            ReLU-155            [-1, 512, 7, 7]               0
          Conv2d-156            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-157            [-1, 512, 7, 7]           1,024
            ReLU-158            [-1, 512, 7, 7]               0
          Conv2d-159           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-160           [-1, 2048, 7, 7]           4,096
            ReLU-161           [-1, 2048, 7, 7]               0
      Bottleneck-162           [-1, 2048, 7, 7]               0
          Conv2d-163            [-1, 512, 7, 7]       1,048,576
     BatchNorm2d-164            [-1, 512, 7, 7]           1,024
            ReLU-165            [-1, 512, 7, 7]               0
          Conv2d-166            [-1, 512, 7, 7]       2,359,296
     BatchNorm2d-167            [-1, 512, 7, 7]           1,024
            ReLU-168            [-1, 512, 7, 7]               0
          Conv2d-169           [-1, 2048, 7, 7]       1,048,576
     BatchNorm2d-170           [-1, 2048, 7, 7]           4,096
            ReLU-171           [-1, 2048, 7, 7]               0
      Bottleneck-172           [-1, 2048, 7, 7]               0
AdaptiveAvgPool2d-173           [-1, 2048, 1, 1]               0
          Linear-174                   [-1, 10]          20,490
================================================================
Total params: 23,528,522
Trainable params: 23,528,522
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 286.55
Params size (MB): 89.75
Estimated Total Size (MB): 376.88
----------------------------------------------------------------

1.5 ResNet 分析

1.5.1 ResNet 视图分析

考虑从输出 \vec{y}_0y0 到 \vec{y}_3y3 的三个 ResNet 块构建的网络, 根据

\vec{y}_3 = \vec{y}_2 + f_3(\vec{y}_2) = |{\vec{y}_1 + f_2(\vec{y}_1)}| + f_3({\vec{y}_1 + f_2(\vec{y}_1)})y3=y2+f3(y2)=∣y1+f2(y1)∣+f3(y1+f2(y1))

=[\vec{y}_0 + f_1(\vec{y}_0) + f_2(\vec{y}_0 + f_1(\vec{y}_0))] + f_3(\vec{y}_0 + f_1(\vec{y}_0) + f_2(\vec{y}_0 + f_1(\vec{y}_0)))=[y0+f1(y0)+f2(y0+f1(y0))]+f3(y0+f1(y0)+f2(y0+f1(y0)))

下图中左图为原始形式，右图为分解视图。分解视图中展示了数据从输入到输出的多条路径

对于严格顺序的网络，这些网络中的输入总是在单个路径中从第一层直接流到最后一层

分解视图中，每条路径可以通过二进制编码向量 \vec{\textbf{b}} = (b_1,b_2,\dots,b_n), b_i \in {0,1}b=(b1,b2,…,bn),bi∈0,1 来索引：如果流过残差块 f_ifi ，则 b_i=1bi=1 ；如果跳过残差块 f_ifi ，则 b_i=0bi=0，因此ResNet 从输入到输出具有 2^n2n 条路径，第 ii 个残差块 f_ifi 的输入汇聚了之前的 i-1i−1 个残差块的 2^{i-1}2i−1 条路径

普通的前馈神经网络 (FNN) 也可以在单个神经元（而不是网络层）这一粒度上运用分解视图，这也可以将网络分解为不同路径的集合。它与 ResNet 分解的区别是

FNN 的神经元分解视图中，所有路径都具有相同的长度
ResNet 网络的残差块分解视图中，所有路径具有不同的路径长度

1.5.2 ResNet 路径分析

ResNet 中，从输入到输出存在许多条不同长度的路径。这些 路径长度 的分布服从二项分布。对于 nn 层深的ResNet，大多数路径的深度为 \frac{n}{2}2n, 下图为一个 54 个块的 ResNet 网络的路径长度的分布，其中 95% 的路径只包含 19～35 个块

ResNet 中，路径梯度 幅度随着它在反向传播中经过的残差块的数量呈指数减小。因此，训练期间大多数梯度来源于更短的路径。对于一个包含 54 个残差块的 ResNet 网络

下图表示单条长度为 kk 的路径在反向传播到 input 处的梯度的幅度的均值，它刻画了长度为 kk 的单条路径的对于更新的影响, 因为长度为 kk 的路径有多条，因此取其平均值

下图表示长度为 kk 的所有路径在反向传播到 input 处的梯度的幅度的和。它刻画了长度为 kk 的所有路径对于更新的影响。它不仅取决于长度为 kk 的单条路径的对于更新的影响，还取决于长度为 kk 的单条路径的数量

有效路径 指反向传播到 input 处的梯度幅度相对较大的路径。ResNet 中有效路径相对较浅，而且有效路径数量占比较少，在一个 54 个块的 ResNet 网络中

几乎所有的梯度更新都来自于长度为 5~17 的路径
长度为 5~17 的路径占网络所有路径的 0.45%

因此 ResNet 不是让梯度流流通整个网络深度来解决梯度消失问题，而是引入能够在非常深的网络中传输梯度的短路径来避免梯度消失问题，和 ResNet 原理类似，随机深度网络起作用有两个原因

训练期间，网络看到的路径分布会发生变化，主要是变得更短
训练期间，每个 mini-batch 选择不同的短路径的子集，这会鼓励各路径独立地产生良好的结果

1.5.3 ResNet 路径破坏性分析

ResNet 网络训练完成之后，如果随机丢弃单个残差块，则测试误差基本不变。因为移除一个残差块时，ResNet 中路径的数量从 2^n2n 减少到 2^{n-1}2n−1，留下了一半的路径。

VGG 网络训练完成之后，如果随机丢弃单个块，则测试误差急剧上升，预测结果就跟随机猜测差不多。因为移除一个块时，VGG 中唯一可行的路径被破坏。

删除 ResNet 残差块通常会删除长路径。当删除了 kk 个残差块时，长度为 xx 的路径的剩余比例为下式

percent = \frac{C^x_{n-k}}{C^x_n}percent=CnxCn−kx

删除 10 个残差模块，一部分有效路径（路径长度为 5~17）仍然被保留，模型测试性能会部分下降
删除 20 个残差模块，绝大部分有效路径（路径长度为 5~17）被删除，模型测试性能会大幅度下降

第2节卷积神经网络模型 DenseNet

2.1 DenseNet 简介

DenseNet 不是通过更深或者更宽的结构，而是通过 特征重用 来提升网络的学习能力

ResNet 的思想是：创建从 “靠近输入的层” 到 “靠近输出的层” 的直连。而 DenseNet 做得更为彻底, 将所有层以前馈的形式相连，这种网络因此称作DenseNet

DenseNet 具有以下的优点

缓解 梯度消失 的问题。因为每层都可以直接从损失函数中获取梯度、从原始输入中获取信息，从而易于训练
密集连接 还具有正则化的效应，缓解了小训练集任务的过拟合
鼓励 特征重用 ，网络将不同层学到的 feature map 进行组合
大幅度减少 参数数量，因为每层的卷积核尺寸都比较小，输出通道数较少 (由增长率 kk 决定)

DenseNet 具有比传统卷积网络更少的参数，因为它不需要重新学习多余的 feature map

传统的前馈神经网络可以视作在层与层之间传递状态的算法，每一层接收前一层的状态，然后将新的状态传递给下一层。这会改变状态，但是也传递了需要保留的信息

ResNet 通过恒等映射来直接传递需要保留的信息，因此层之间只需要传递状态的变化

DenseNet 会将所有层的状态全部保存到集体知识中，同时每一层增加很少数量的 feture map 到网络的集体知识中
DenseNet 的层很窄（即 feature map 的通道数很小），如每一层的输出只有 12 个通道

2.2 DenseNet 模型结构

2.2.1 DenseNet 块

具有 LL 层的传统卷积网络有 LL 个连接，每层仅仅与 后继层 相连。具有 LL 个残差块的 ResNet 在每个残差块增加了跨层连接，第 ll 个残差块的输出为

\vec{x}_{l+1} = \vec{x}_l + F(\vec{x}_l + W_l)xl+1=xl+F(xl+Wl)

其中 \vec{x}_lxl 是第 ll 个残差块的输入特征；W_l = \{W_{l,k} | 1 \le k \le K\}Wl={Wl,k∣1≤k≤K} 为一组与第 ll 个残差块相关的权重(包括偏置项)，KK 是残差块中的层的数量，FF 代表残差函数

具有 LL 个层块的 DenseNet 块有 \frac{L(L+1)}{2}2L(L+1) 个连接，每层以前馈的方式将该层与它后面的 所有层 相连。对于第 ll 层：所有先前层的 feature map 都作为本层的输入，第 ll 层具有 ll 个输入feature map ；本层输出的feature map 都将作为后面 L-1L−1 层的输入。假设 DenseNet 块包含 LL 层，每一层都实现了一个非线性变换 H_l()Hl() ，其中 ll 表示层的索引。假设DenseNet块的输入为 x_0x0 ，DenseNet 块的第 ll 层的输出为 x_lxl ，则有

x_l = H_l([x_0,x_1,\dots,x_{l-1}])xl=Hl([x0,x1,…,xl−1])

其中 [x_0,x_1,\dots,x_{l-1}][x0,x1,…,xl−1] 表示 0,\dots,l-10,…,l−1 层输出的 feature map 沿着通道方向的拼接。在 ResNet 中，不同 feature map 是通过直接相加来作为块的输出。当 feature map 的尺寸改变时，无法沿着通道方向进行拼接。此时将网络划分为多个 DenseNet 块，每块内部的 feature map 尺寸相同，块之间的feature map 尺寸不同

2.2.1.1 增长率

DenseNet 块中，每层的 H_l()Hl() 输出的 feature map 通道数都相同，都是 kk 个。kk 是一个重要的超参数，称作网络的 增长率。第 ll 层的输入 feature map 的通道数为 k_0 + k \times (l-1)k0+k×(l−1), 其中 k_0k0 为输入层的通道数

DenseNet 不同于现有网络的一个重要地方是 DenseNet 的网络很窄，即输出的 feature map 通道数较小，一个很小的增长率就能够获得不错的效果。一种解释是 DenseNet 块的每层都可以访问块内的所有早前层输出的 feature map，这些f eature map 可以视作 DenseNet 块的 全局状态。每层输出的 feature map 都将被添加到块的这个全局状态中，该全局状态可以理解为网络块的 集体知识，由块内所有层共享, 而增长率 kk 决定了新增特征占全局状态的比例

因此 feature map 无需逐层复制（因为它是全局共享），这也是 DenseNet 与传统网络结构不同的地方。这有助于整个网络的特征重用，并产生更紧凑的模型。

2.2.1.2 非线性变换 H_l()Hl()

H_l()Hl() 可以是包含了 Batch Normalization(BN) 、ReLU 单元、池化或者卷积等操作的复合函数
H_l()Hl() 的结构为先执行BN，再执行ReLU，最后接一个 3\times33×3 的卷积，即 BN-ReLU-Conv(3x3)

2.2.1.3 BottleNeck

DenseNet 块中每层只产生 kk 个输出 feature map，但是它具有很多输入。当在 H_l()Hl() 之前采用 1\times11×1 卷积实现降维时，可以减小计算量。

H_l()Hl() 的输入是由第 0,1,2,\dots,l-10,1,2,…,l−1 层的输出 feature map 组成，其中第 0 层的输出feature map就是整个DensNet 块的输入feature map

事实上第 1,2,\dots,l-11,2,…,l−1 层从 DensNet 块的输入 feature map 中抽取各种特征。即 H_l()Hl() 包含了 DensNet 块的输入 feature map 的冗余信息，这可以通过 1times11times1 卷积降维来去掉这种冗余性。因此这种 1\times11×1 卷积降维对于 DenseNet 块极其有效

如果在 H_l()Hl() 中引入 1\times11×1 卷积降维，则该版本的 DenseNet 称作 DenseNet-B。其 H_l()Hl() 结构为先执行 BN ，再执行 ReLU，再接一个 1\times11×1 的卷积，再执行 BN，再执行 ReLU，最后接一个 3\times33×3 的卷积, 即 BN-ReLU-Conv(1x1)-BN-ReLU-Conv(3x3)

其中1x1 卷积的输出通道数是个超参数，选取为 4\times k4×k

2.2.2 DenseNet 过渡层

一个 DenseNet 网络具有多个 DenseNet 块，DenseNet 块之间由过渡层连接。DenseNet 块之间的层称为 过渡层，其主要作用是连接不同的 DenseNet 块。

过渡层可以包含卷积或池化操作，从而改变前一个 DenseNet 块的输出 feature map 的大小(包括尺寸大小、通道数量)。过渡层由一个 BN 层、一个 1\times11×1 卷积层、一个 2\times22×2 平均池化层组成。其中 1\times11×1 卷积层用于减少 DenseNet 块的输出通道数，提高模型的紧凑性

如果不减少 DenseNet 块的输出通道数，则经过了 NN 个DenseNet 块之后，网络的 feature map 的通道数为：K_0 + N \times k \times (L - 1)K0+N×k×(L−1)，其中 k_0k0 为输入图片的通道数，LL 为每个DenseNet 块的层数

如果 Dense 块输出 feature map的通道数为 mm，则可以使得过渡层输出 feature map 的通道数为 \theta mθm，其中 \thetaθ 为压缩因子

当 \theta = 1θ=1 时，经过过渡层的feature map 通道数不变
当 \theta < 1θ<1 时，经过过渡层的feature map 通道数减小。此时的DenseNet 称做 DenseNet-C。结合了DenseNet-C 和 DenseNet-B 的改进的网络称作 DenseNet-BC

2.2.3 DenseNet 模型网络

ImageNet 训练的 DenseNet 网络结构，其中增长率 k=32k=32

conv 代表的是 BN-ReLU-Conv 的组合, 1\times11×1 conv 表示先执行 BN，再执行 ReLU，最后执行 1\times11×1 的卷积
DenseNet-xx 表示 DenseNet 块有 xx 层, DenseNet-169 表示 DenseNet 块有 L=169L=169 层
所有 DenseNet 使用的是 DenseNet-BC 结构，输入图片尺寸为 224\times224224×224，初始卷积尺寸为 7\times77×7、输出通道 2k、步长为 2 ，压缩因子 \theta = 0.5θ=0.5
所有 DenseNet 块的最后接一个全局平均池化层，该池化层的结果作为 softmax 输出层的输入

2.3 DenseNet PyTorch

# %load dense.py import math import torch import torch.nn as nn import torch.nn.functional as F from collections import OrderedDict class DenseLayer(nn.Module): expansion = 4 def __init__(self, in_channels, growth_rate): super(DenseLayer, self).__init__() zip_channels = self.expansion * growth_rate self.features = nn.Sequential( nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True), nn.Conv2d(in_channels, zip_channels, kernel_size=1, bias=False), nn.BatchNorm2d(zip_channels), nn.ReLU(inplace=True), nn.Conv2d(zip_channels, growth_rate, kernel_size=3, padding=1, bias=False) ) def forward(self, x): out = self.features(x) out = torch.cat([out, x], 1) return out class Transition(nn.Sequential): def __init__(self, num_input_features, num_output_features): super(Transition, self).__init__() self.features = nn.Sequential( nn.BatchNorm2d(num_input_features), nn.ReLU(inplace=True), nn.Conv2d(num_input_features, num_output_features,kernel_size=1, bias=False), nn.AvgPool2d(kernel_size=2, stride=2) ) def forward(self, x): out = self.features(x) return out class DenseNet121(nn.Module): def __init__(self,growth_rate = 32,reduction = 0.5,num_classes=1000): super(DenseNet121, self).__init__() self.growth_rate = growth_rate self.reduction = reduction self.features = nn.Sequential(OrderedDict([ ('conv0', nn.Conv2d(3, 64, kernel_size=7, stride=2,padding=3, bias=False)), ('norm0', nn.BatchNorm2d(64)), ('relu0', nn.ReLU(inplace=True)), ('pool0', nn.MaxPool2d(kernel_size=3, stride=2, padding=1)), ])) num_features = 2 * growth_rate num_blocks = (6, 12, 24, 16) self.layer1, num_features = self._make_dense_layer(num_features, num_blocks[0]) self.layer2, num_features = self._make_dense_layer(num_features, num_blocks[1]) self.layer3, num_features = self._make_dense_layer(num_features, num_blocks[2]) self.layer4, num_features = self._make_dense_layer(num_features, num_blocks[3], transition=False) self.bn = nn.BatchNorm2d(num_features) self.classifier = nn.Linear(num_features,num_classes) def _make_dense_layer(self, in_channels, nblock, transition=True): layers = [] for i in range(nblock): layers += [DenseLayer(in_channels, self.growth_rate)] in_channels += self.growth_rate out_channels = in_channels if transition: out_channels = int(math.floor(in_channels * self.reduction)) layers += [Transition(in_channels, out_channels)] return nn.Sequential(*layers), out_channels def forward(self,x): out = self.features(x) out = self.layer1(out) out = self.layer2(out) out = self.layer3(out) out = self.layer4(out) out = self.bn(out) out = F.relu(out, inplace=True) out = F.adaptive_avg_pool2d(out, (1, 1)) out = out.view(out.size(0), -1) out = self.classifier(out) return out def build_dense121(phase,num_classes,pretrained): if phase != "test" and phase != "train": print("ERROR: Phase: " + phase + " not recognized") return if not pretrained: model = DenseNet121(num_classes=num_classes) else: model = DenseNet121() model_weights_path = 'weights/densenet121-a639ec97.pth' model.load_state_dict(torch.load(model_weights_path), strict=False) for parma in model.parameters(): parma.requires_grad = False ratio = int(math.sqrt(1024/num_classes)) floor = math.floor(math.log2(ratio)) hidden_size = int(math.pow(2,10-floor)) model.classifier = nn.Sequential(nn.Linear(1024, hidden_size), nn.ReLU(inplace=True), nn.Linear(hidden_size, num_classes)) return model

net = build_dense121('train',10,False) net.cuda() summary(net,(3,224,224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
       BatchNorm2d-5           [-1, 64, 56, 56]             128
              ReLU-6           [-1, 64, 56, 56]               0
            Conv2d-7          [-1, 128, 56, 56]           8,192
       BatchNorm2d-8          [-1, 128, 56, 56]             256
              ReLU-9          [-1, 128, 56, 56]               0
           Conv2d-10           [-1, 32, 56, 56]          36,864
       DenseLayer-11           [-1, 96, 56, 56]               0
      BatchNorm2d-12           [-1, 96, 56, 56]             192
             ReLU-13           [-1, 96, 56, 56]               0
           Conv2d-14          [-1, 128, 56, 56]          12,288
      BatchNorm2d-15          [-1, 128, 56, 56]             256
             ReLU-16          [-1, 128, 56, 56]               0
           Conv2d-17           [-1, 32, 56, 56]          36,864
       DenseLayer-18          [-1, 128, 56, 56]               0
      BatchNorm2d-19          [-1, 128, 56, 56]             256
             ReLU-20          [-1, 128, 56, 56]               0
           Conv2d-21          [-1, 128, 56, 56]          16,384
      BatchNorm2d-22          [-1, 128, 56, 56]             256
             ReLU-23          [-1, 128, 56, 56]               0
           Conv2d-24           [-1, 32, 56, 56]          36,864
       DenseLayer-25          [-1, 160, 56, 56]               0
      BatchNorm2d-26          [-1, 160, 56, 56]             320
             ReLU-27          [-1, 160, 56, 56]               0
           Conv2d-28          [-1, 128, 56, 56]          20,480
      BatchNorm2d-29          [-1, 128, 56, 56]             256
             ReLU-30          [-1, 128, 56, 56]               0
           Conv2d-31           [-1, 32, 56, 56]          36,864
       DenseLayer-32          [-1, 192, 56, 56]               0
      BatchNorm2d-33          [-1, 192, 56, 56]             384
             ReLU-34          [-1, 192, 56, 56]               0
           Conv2d-35          [-1, 128, 56, 56]          24,576
      BatchNorm2d-36          [-1, 128, 56, 56]             256
             ReLU-37          [-1, 128, 56, 56]               0
           Conv2d-38           [-1, 32, 56, 56]          36,864
       DenseLayer-39          [-1, 224, 56, 56]               0
      BatchNorm2d-40          [-1, 224, 56, 56]             448
             ReLU-41          [-1, 224, 56, 56]               0
           Conv2d-42          [-1, 128, 56, 56]          28,672
      BatchNorm2d-43          [-1, 128, 56, 56]             256
             ReLU-44          [-1, 128, 56, 56]               0
           Conv2d-45           [-1, 32, 56, 56]          36,864
       DenseLayer-46          [-1, 256, 56, 56]               0
      BatchNorm2d-47          [-1, 256, 56, 56]             512
             ReLU-48          [-1, 256, 56, 56]               0
           Conv2d-49          [-1, 128, 56, 56]          32,768
        AvgPool2d-50          [-1, 128, 28, 28]               0
      BatchNorm2d-51          [-1, 128, 28, 28]             256
             ReLU-52          [-1, 128, 28, 28]               0
           Conv2d-53          [-1, 128, 28, 28]          16,384
      BatchNorm2d-54          [-1, 128, 28, 28]             256
             ReLU-55          [-1, 128, 28, 28]               0
           Conv2d-56           [-1, 32, 28, 28]          36,864
       DenseLayer-57          [-1, 160, 28, 28]               0
      BatchNorm2d-58          [-1, 160, 28, 28]             320
             ReLU-59          [-1, 160, 28, 28]               0
           Conv2d-60          [-1, 128, 28, 28]          20,480
      BatchNorm2d-61          [-1, 128, 28, 28]             256
             ReLU-62          [-1, 128, 28, 28]               0
           Conv2d-63           [-1, 32, 28, 28]          36,864
       DenseLayer-64          [-1, 192, 28, 28]               0
      BatchNorm2d-65          [-1, 192, 28, 28]             384
             ReLU-66          [-1, 192, 28, 28]               0
           Conv2d-67          [-1, 128, 28, 28]          24,576
      BatchNorm2d-68          [-1, 128, 28, 28]             256
             ReLU-69          [-1, 128, 28, 28]               0
           Conv2d-70           [-1, 32, 28, 28]          36,864
       DenseLayer-71          [-1, 224, 28, 28]               0
      BatchNorm2d-72          [-1, 224, 28, 28]             448
             ReLU-73          [-1, 224, 28, 28]               0
           Conv2d-74          [-1, 128, 28, 28]          28,672
      BatchNorm2d-75          [-1, 128, 28, 28]             256
             ReLU-76          [-1, 128, 28, 28]               0
           Conv2d-77           [-1, 32, 28, 28]          36,864
       DenseLayer-78          [-1, 256, 28, 28]               0
      BatchNorm2d-79          [-1, 256, 28, 28]             512
             ReLU-80          [-1, 256, 28, 28]               0
           Conv2d-81          [-1, 128, 28, 28]          32,768
      BatchNorm2d-82          [-1, 128, 28, 28]             256
             ReLU-83          [-1, 128, 28, 28]               0
           Conv2d-84           [-1, 32, 28, 28]          36,864
       DenseLayer-85          [-1, 288, 28, 28]               0
      BatchNorm2d-86          [-1, 288, 28, 28]             576
             ReLU-87          [-1, 288, 28, 28]               0
           Conv2d-88          [-1, 128, 28, 28]          36,864
      BatchNorm2d-89          [-1, 128, 28, 28]             256
             ReLU-90          [-1, 128, 28, 28]               0
           Conv2d-91           [-1, 32, 28, 28]          36,864
       DenseLayer-92          [-1, 320, 28, 28]               0
      BatchNorm2d-93          [-1, 320, 28, 28]             640
             ReLU-94          [-1, 320, 28, 28]               0
           Conv2d-95          [-1, 128, 28, 28]          40,960
      BatchNorm2d-96          [-1, 128, 28, 28]             256
             ReLU-97          [-1, 128, 28, 28]               0
           Conv2d-98           [-1, 32, 28, 28]          36,864
       DenseLayer-99          [-1, 352, 28, 28]               0
     BatchNorm2d-100          [-1, 352, 28, 28]             704
            ReLU-101          [-1, 352, 28, 28]               0
          Conv2d-102          [-1, 128, 28, 28]          45,056
     BatchNorm2d-103          [-1, 128, 28, 28]             256
            ReLU-104          [-1, 128, 28, 28]               0
          Conv2d-105           [-1, 32, 28, 28]          36,864
      DenseLayer-106          [-1, 384, 28, 28]               0
     BatchNorm2d-107          [-1, 384, 28, 28]             768
            ReLU-108          [-1, 384, 28, 28]               0
          Conv2d-109          [-1, 128, 28, 28]          49,152
     BatchNorm2d-110          [-1, 128, 28, 28]             256
            ReLU-111          [-1, 128, 28, 28]               0
          Conv2d-112           [-1, 32, 28, 28]          36,864
      DenseLayer-113          [-1, 416, 28, 28]               0
     BatchNorm2d-114          [-1, 416, 28, 28]             832
            ReLU-115          [-1, 416, 28, 28]               0
          Conv2d-116          [-1, 128, 28, 28]          53,248
     BatchNorm2d-117          [-1, 128, 28, 28]             256
            ReLU-118          [-1, 128, 28, 28]               0
          Conv2d-119           [-1, 32, 28, 28]          36,864
      DenseLayer-120          [-1, 448, 28, 28]               0
     BatchNorm2d-121          [-1, 448, 28, 28]             896
            ReLU-122          [-1, 448, 28, 28]               0
          Conv2d-123          [-1, 128, 28, 28]          57,344
     BatchNorm2d-124          [-1, 128, 28, 28]             256
            ReLU-125          [-1, 128, 28, 28]               0
          Conv2d-126           [-1, 32, 28, 28]          36,864
      DenseLayer-127          [-1, 480, 28, 28]               0
     BatchNorm2d-128          [-1, 480, 28, 28]             960
            ReLU-129          [-1, 480, 28, 28]               0
          Conv2d-130          [-1, 128, 28, 28]          61,440
     BatchNorm2d-131          [-1, 128, 28, 28]             256
            ReLU-132          [-1, 128, 28, 28]               0
          Conv2d-133           [-1, 32, 28, 28]          36,864
      DenseLayer-134          [-1, 512, 28, 28]               0
     BatchNorm2d-135          [-1, 512, 28, 28]           1,024
            ReLU-136          [-1, 512, 28, 28]               0
          Conv2d-137          [-1, 256, 28, 28]         131,072
       AvgPool2d-138          [-1, 256, 14, 14]               0
     BatchNorm2d-139          [-1, 256, 14, 14]             512
            ReLU-140          [-1, 256, 14, 14]               0
          Conv2d-141          [-1, 128, 14, 14]          32,768
     BatchNorm2d-142          [-1, 128, 14, 14]             256
            ReLU-143          [-1, 128, 14, 14]               0
          Conv2d-144           [-1, 32, 14, 14]          36,864
      DenseLayer-145          [-1, 288, 14, 14]               0
     BatchNorm2d-146          [-1, 288, 14, 14]             576
            ReLU-147          [-1, 288, 14, 14]               0
          Conv2d-148          [-1, 128, 14, 14]          36,864
     BatchNorm2d-149          [-1, 128, 14, 14]             256
            ReLU-150          [-1, 128, 14, 14]               0
          Conv2d-151           [-1, 32, 14, 14]          36,864
      DenseLayer-152          [-1, 320, 14, 14]               0
     BatchNorm2d-153          [-1, 320, 14, 14]             640
            ReLU-154          [-1, 320, 14, 14]               0
          Conv2d-155          [-1, 128, 14, 14]          40,960
     BatchNorm2d-156          [-1, 128, 14, 14]             256
            ReLU-157          [-1, 128, 14, 14]               0
          Conv2d-158           [-1, 32, 14, 14]          36,864
      DenseLayer-159          [-1, 352, 14, 14]               0
     BatchNorm2d-160          [-1, 352, 14, 14]             704
            ReLU-161          [-1, 352, 14, 14]               0
          Conv2d-162          [-1, 128, 14, 14]          45,056
     BatchNorm2d-163          [-1, 128, 14, 14]             256
            ReLU-164          [-1, 128, 14, 14]               0
          Conv2d-165           [-1, 32, 14, 14]          36,864
      DenseLayer-166          [-1, 384, 14, 14]               0
     BatchNorm2d-167          [-1, 384, 14, 14]             768
            ReLU-168          [-1, 384, 14, 14]               0
          Conv2d-169          [-1, 128, 14, 14]          49,152
     BatchNorm2d-170          [-1, 128, 14, 14]             256
            ReLU-171          [-1, 128, 14, 14]               0
          Conv2d-172           [-1, 32, 14, 14]          36,864
      DenseLayer-173          [-1, 416, 14, 14]               0
     BatchNorm2d-174          [-1, 416, 14, 14]             832
            ReLU-175          [-1, 416, 14, 14]               0
          Conv2d-176          [-1, 128, 14, 14]          53,248
     BatchNorm2d-177          [-1, 128, 14, 14]             256
            ReLU-178          [-1, 128, 14, 14]               0
          Conv2d-179           [-1, 32, 14, 14]          36,864
      DenseLayer-180          [-1, 448, 14, 14]               0
     BatchNorm2d-181          [-1, 448, 14, 14]             896
            ReLU-182          [-1, 448, 14, 14]               0
          Conv2d-183          [-1, 128, 14, 14]          57,344
     BatchNorm2d-184          [-1, 128, 14, 14]             256
            ReLU-185          [-1, 128, 14, 14]               0
          Conv2d-186           [-1, 32, 14, 14]          36,864
      DenseLayer-187          [-1, 480, 14, 14]               0
     BatchNorm2d-188          [-1, 480, 14, 14]             960
            ReLU-189          [-1, 480, 14, 14]               0
          Conv2d-190          [-1, 128, 14, 14]          61,440
     BatchNorm2d-191          [-1, 128, 14, 14]             256
            ReLU-192          [-1, 128, 14, 14]               0
          Conv2d-193           [-1, 32, 14, 14]          36,864
      DenseLayer-194          [-1, 512, 14, 14]               0
     BatchNorm2d-195          [-1, 512, 14, 14]           1,024
            ReLU-196          [-1, 512, 14, 14]               0
          Conv2d-197          [-1, 128, 14, 14]          65,536
     BatchNorm2d-198          [-1, 128, 14, 14]             256
            ReLU-199          [-1, 128, 14, 14]               0
          Conv2d-200           [-1, 32, 14, 14]          36,864
      DenseLayer-201          [-1, 544, 14, 14]               0
     BatchNorm2d-202          [-1, 544, 14, 14]           1,088
            ReLU-203          [-1, 544, 14, 14]               0
          Conv2d-204          [-1, 128, 14, 14]          69,632
     BatchNorm2d-205          [-1, 128, 14, 14]             256
            ReLU-206          [-1, 128, 14, 14]               0
          Conv2d-207           [-1, 32, 14, 14]          36,864
      DenseLayer-208          [-1, 576, 14, 14]               0
     BatchNorm2d-209          [-1, 576, 14, 14]           1,152
            ReLU-210          [-1, 576, 14, 14]               0
          Conv2d-211          [-1, 128, 14, 14]          73,728
     BatchNorm2d-212          [-1, 128, 14, 14]             256
            ReLU-213          [-1, 128, 14, 14]               0
          Conv2d-214           [-1, 32, 14, 14]          36,864
      DenseLayer-215          [-1, 608, 14, 14]               0
     BatchNorm2d-216          [-1, 608, 14, 14]           1,216
            ReLU-217          [-1, 608, 14, 14]               0
          Conv2d-218          [-1, 128, 14, 14]          77,824
     BatchNorm2d-219          [-1, 128, 14, 14]             256
            ReLU-220          [-1, 128, 14, 14]               0
          Conv2d-221           [-1, 32, 14, 14]          36,864
      DenseLayer-222          [-1, 640, 14, 14]               0
     BatchNorm2d-223          [-1, 640, 14, 14]           1,280
            ReLU-224          [-1, 640, 14, 14]               0
          Conv2d-225          [-1, 128, 14, 14]          81,920
     BatchNorm2d-226          [-1, 128, 14, 14]             256
            ReLU-227          [-1, 128, 14, 14]               0
          Conv2d-228           [-1, 32, 14, 14]          36,864
      DenseLayer-229          [-1, 672, 14, 14]               0
     BatchNorm2d-230          [-1, 672, 14, 14]           1,344
            ReLU-231          [-1, 672, 14, 14]               0
          Conv2d-232          [-1, 128, 14, 14]          86,016
     BatchNorm2d-233          [-1, 128, 14, 14]             256
            ReLU-234          [-1, 128, 14, 14]               0
          Conv2d-235           [-1, 32, 14, 14]          36,864
      DenseLayer-236          [-1, 704, 14, 14]               0
     BatchNorm2d-237          [-1, 704, 14, 14]           1,408
            ReLU-238          [-1, 704, 14, 14]               0
          Conv2d-239          [-1, 128, 14, 14]          90,112
     BatchNorm2d-240          [-1, 128, 14, 14]             256
            ReLU-241          [-1, 128, 14, 14]               0
          Conv2d-242           [-1, 32, 14, 14]          36,864
      DenseLayer-243          [-1, 736, 14, 14]               0
     BatchNorm2d-244          [-1, 736, 14, 14]           1,472
            ReLU-245          [-1, 736, 14, 14]               0
          Conv2d-246          [-1, 128, 14, 14]          94,208
     BatchNorm2d-247          [-1, 128, 14, 14]             256
            ReLU-248          [-1, 128, 14, 14]               0
          Conv2d-249           [-1, 32, 14, 14]          36,864
      DenseLayer-250          [-1, 768, 14, 14]               0
     BatchNorm2d-251          [-1, 768, 14, 14]           1,536
            ReLU-252          [-1, 768, 14, 14]               0
          Conv2d-253          [-1, 128, 14, 14]          98,304
     BatchNorm2d-254          [-1, 128, 14, 14]             256
            ReLU-255          [-1, 128, 14, 14]               0
          Conv2d-256           [-1, 32, 14, 14]          36,864
      DenseLayer-257          [-1, 800, 14, 14]               0
     BatchNorm2d-258          [-1, 800, 14, 14]           1,600
            ReLU-259          [-1, 800, 14, 14]               0
          Conv2d-260          [-1, 128, 14, 14]         102,400
     BatchNorm2d-261          [-1, 128, 14, 14]             256
            ReLU-262          [-1, 128, 14, 14]               0
          Conv2d-263           [-1, 32, 14, 14]          36,864
      DenseLayer-264          [-1, 832, 14, 14]               0
     BatchNorm2d-265          [-1, 832, 14, 14]           1,664
            ReLU-266          [-1, 832, 14, 14]               0
          Conv2d-267          [-1, 128, 14, 14]         106,496
     BatchNorm2d-268          [-1, 128, 14, 14]             256
            ReLU-269          [-1, 128, 14, 14]               0
          Conv2d-270           [-1, 32, 14, 14]          36,864
      DenseLayer-271          [-1, 864, 14, 14]               0
     BatchNorm2d-272          [-1, 864, 14, 14]           1,728
            ReLU-273          [-1, 864, 14, 14]               0
          Conv2d-274          [-1, 128, 14, 14]         110,592
     BatchNorm2d-275          [-1, 128, 14, 14]             256
            ReLU-276          [-1, 128, 14, 14]               0
          Conv2d-277           [-1, 32, 14, 14]          36,864
      DenseLayer-278          [-1, 896, 14, 14]               0
     BatchNorm2d-279          [-1, 896, 14, 14]           1,792
            ReLU-280          [-1, 896, 14, 14]               0
          Conv2d-281          [-1, 128, 14, 14]         114,688
     BatchNorm2d-282          [-1, 128, 14, 14]             256
            ReLU-283          [-1, 128, 14, 14]               0
          Conv2d-284           [-1, 32, 14, 14]          36,864
      DenseLayer-285          [-1, 928, 14, 14]               0
     BatchNorm2d-286          [-1, 928, 14, 14]           1,856
            ReLU-287          [-1, 928, 14, 14]               0
          Conv2d-288          [-1, 128, 14, 14]         118,784
     BatchNorm2d-289          [-1, 128, 14, 14]             256
            ReLU-290          [-1, 128, 14, 14]               0
          Conv2d-291           [-1, 32, 14, 14]          36,864
      DenseLayer-292          [-1, 960, 14, 14]               0
     BatchNorm2d-293          [-1, 960, 14, 14]           1,920
            ReLU-294          [-1, 960, 14, 14]               0
          Conv2d-295          [-1, 128, 14, 14]         122,880
     BatchNorm2d-296          [-1, 128, 14, 14]             256
            ReLU-297          [-1, 128, 14, 14]               0
          Conv2d-298           [-1, 32, 14, 14]          36,864
      DenseLayer-299          [-1, 992, 14, 14]               0
     BatchNorm2d-300          [-1, 992, 14, 14]           1,984
            ReLU-301          [-1, 992, 14, 14]               0
          Conv2d-302          [-1, 128, 14, 14]         126,976
     BatchNorm2d-303          [-1, 128, 14, 14]             256
            ReLU-304          [-1, 128, 14, 14]               0
          Conv2d-305           [-1, 32, 14, 14]          36,864
      DenseLayer-306         [-1, 1024, 14, 14]               0
     BatchNorm2d-307         [-1, 1024, 14, 14]           2,048
            ReLU-308         [-1, 1024, 14, 14]               0
          Conv2d-309          [-1, 512, 14, 14]         524,288
       AvgPool2d-310            [-1, 512, 7, 7]               0
     BatchNorm2d-311            [-1, 512, 7, 7]           1,024
            ReLU-312            [-1, 512, 7, 7]               0
          Conv2d-313            [-1, 128, 7, 7]          65,536
     BatchNorm2d-314            [-1, 128, 7, 7]             256
            ReLU-315            [-1, 128, 7, 7]               0
          Conv2d-316             [-1, 32, 7, 7]          36,864
      DenseLayer-317            [-1, 544, 7, 7]               0
     BatchNorm2d-318            [-1, 544, 7, 7]           1,088
            ReLU-319            [-1, 544, 7, 7]               0
          Conv2d-320            [-1, 128, 7, 7]          69,632
     BatchNorm2d-321            [-1, 128, 7, 7]             256
            ReLU-322            [-1, 128, 7, 7]               0
          Conv2d-323             [-1, 32, 7, 7]          36,864
      DenseLayer-324            [-1, 576, 7, 7]               0
     BatchNorm2d-325            [-1, 576, 7, 7]           1,152
            ReLU-326            [-1, 576, 7, 7]               0
          Conv2d-327            [-1, 128, 7, 7]          73,728
     BatchNorm2d-328            [-1, 128, 7, 7]             256
            ReLU-329            [-1, 128, 7, 7]               0
          Conv2d-330             [-1, 32, 7, 7]          36,864
      DenseLayer-331            [-1, 608, 7, 7]               0
     BatchNorm2d-332            [-1, 608, 7, 7]           1,216
            ReLU-333            [-1, 608, 7, 7]               0
          Conv2d-334            [-1, 128, 7, 7]          77,824
     BatchNorm2d-335            [-1, 128, 7, 7]             256
            ReLU-336            [-1, 128, 7, 7]               0
          Conv2d-337             [-1, 32, 7, 7]          36,864
      DenseLayer-338            [-1, 640, 7, 7]               0
     BatchNorm2d-339            [-1, 640, 7, 7]           1,280
            ReLU-340            [-1, 640, 7, 7]               0
          Conv2d-341            [-1, 128, 7, 7]          81,920
     BatchNorm2d-342            [-1, 128, 7, 7]             256
            ReLU-343            [-1, 128, 7, 7]               0
          Conv2d-344             [-1, 32, 7, 7]          36,864
      DenseLayer-345            [-1, 672, 7, 7]               0
     BatchNorm2d-346            [-1, 672, 7, 7]           1,344
            ReLU-347            [-1, 672, 7, 7]               0
          Conv2d-348            [-1, 128, 7, 7]          86,016
     BatchNorm2d-349            [-1, 128, 7, 7]             256
            ReLU-350            [-1, 128, 7, 7]               0
          Conv2d-351             [-1, 32, 7, 7]          36,864
      DenseLayer-352            [-1, 704, 7, 7]               0
     BatchNorm2d-353            [-1, 704, 7, 7]           1,408
            ReLU-354            [-1, 704, 7, 7]               0
          Conv2d-355            [-1, 128, 7, 7]          90,112
     BatchNorm2d-356            [-1, 128, 7, 7]             256
            ReLU-357            [-1, 128, 7, 7]               0
          Conv2d-358             [-1, 32, 7, 7]          36,864
      DenseLayer-359            [-1, 736, 7, 7]               0
     BatchNorm2d-360            [-1, 736, 7, 7]           1,472
            ReLU-361            [-1, 736, 7, 7]               0
          Conv2d-362            [-1, 128, 7, 7]          94,208
     BatchNorm2d-363            [-1, 128, 7, 7]             256
            ReLU-364            [-1, 128, 7, 7]               0
          Conv2d-365             [-1, 32, 7, 7]          36,864
      DenseLayer-366            [-1, 768, 7, 7]               0
     BatchNorm2d-367            [-1, 768, 7, 7]           1,536
            ReLU-368            [-1, 768, 7, 7]               0
          Conv2d-369            [-1, 128, 7, 7]          98,304
     BatchNorm2d-370            [-1, 128, 7, 7]             256
            ReLU-371            [-1, 128, 7, 7]               0
          Conv2d-372             [-1, 32, 7, 7]          36,864
      DenseLayer-373            [-1, 800, 7, 7]               0
     BatchNorm2d-374            [-1, 800, 7, 7]           1,600
            ReLU-375            [-1, 800, 7, 7]               0
          Conv2d-376            [-1, 128, 7, 7]         102,400
     BatchNorm2d-377            [-1, 128, 7, 7]             256
            ReLU-378            [-1, 128, 7, 7]               0
          Conv2d-379             [-1, 32, 7, 7]          36,864
      DenseLayer-380            [-1, 832, 7, 7]               0
     BatchNorm2d-381            [-1, 832, 7, 7]           1,664
            ReLU-382            [-1, 832, 7, 7]               0
          Conv2d-383            [-1, 128, 7, 7]         106,496
     BatchNorm2d-384            [-1, 128, 7, 7]             256
            ReLU-385            [-1, 128, 7, 7]               0
          Conv2d-386             [-1, 32, 7, 7]          36,864
      DenseLayer-387            [-1, 864, 7, 7]               0
     BatchNorm2d-388            [-1, 864, 7, 7]           1,728
            ReLU-389            [-1, 864, 7, 7]               0
          Conv2d-390            [-1, 128, 7, 7]         110,592
     BatchNorm2d-391            [-1, 128, 7, 7]             256
            ReLU-392            [-1, 128, 7, 7]               0
          Conv2d-393             [-1, 32, 7, 7]          36,864
      DenseLayer-394            [-1, 896, 7, 7]               0
     BatchNorm2d-395            [-1, 896, 7, 7]           1,792
            ReLU-396            [-1, 896, 7, 7]               0
          Conv2d-397            [-1, 128, 7, 7]         114,688
     BatchNorm2d-398            [-1, 128, 7, 7]             256
            ReLU-399            [-1, 128, 7, 7]               0
          Conv2d-400             [-1, 32, 7, 7]          36,864
      DenseLayer-401            [-1, 928, 7, 7]               0
     BatchNorm2d-402            [-1, 928, 7, 7]           1,856
            ReLU-403            [-1, 928, 7, 7]               0
          Conv2d-404            [-1, 128, 7, 7]         118,784
     BatchNorm2d-405            [-1, 128, 7, 7]             256
            ReLU-406            [-1, 128, 7, 7]               0
          Conv2d-407             [-1, 32, 7, 7]          36,864
      DenseLayer-408            [-1, 960, 7, 7]               0
     BatchNorm2d-409            [-1, 960, 7, 7]           1,920
            ReLU-410            [-1, 960, 7, 7]               0
          Conv2d-411            [-1, 128, 7, 7]         122,880
     BatchNorm2d-412            [-1, 128, 7, 7]             256
            ReLU-413            [-1, 128, 7, 7]               0
          Conv2d-414             [-1, 32, 7, 7]          36,864
      DenseLayer-415            [-1, 992, 7, 7]               0
     BatchNorm2d-416            [-1, 992, 7, 7]           1,984
            ReLU-417            [-1, 992, 7, 7]               0
          Conv2d-418            [-1, 128, 7, 7]         126,976
     BatchNorm2d-419            [-1, 128, 7, 7]             256
            ReLU-420            [-1, 128, 7, 7]               0
          Conv2d-421             [-1, 32, 7, 7]          36,864
      DenseLayer-422           [-1, 1024, 7, 7]               0
     BatchNorm2d-423           [-1, 1024, 7, 7]           2,048
          Linear-424                   [-1, 10]          10,250
================================================================
Total params: 6,964,106
Trainable params: 6,964,106
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 371.81
Params size (MB): 26.57
Estimated Total Size (MB): 398.95
----------------------------------------------------------------

2.4 DenseNet 设计技巧

2.4.1 内存消耗

虽然 DenseNet 的计算效率较高、参数相对较少，但是 DenseNet 对内存 不友好 。考虑到 GPU 显存大小的限制，因此无法训练较深的 DenseNet。假设DenseNet块包含 LL 层，对于第 ll 层有 x_l = H_l([x_0,x_1,\dots,x_{l-1}])xl=Hl([x0,x1,…,xl−1])

假设每层的输出 feature map 尺寸均为 W \times HW×H、通道数为 kk，H_lHl 由 BN-ReLU-Conv(3x3) 组成，则

拼接 Concat 操作 [\dots][…] ：需要生成临时 feature map 作为第 ll 层的输入，内存消耗为 WHK \times lWHK×l
BN 操作需要生成临时 feature map 作为 ReLU 的输入，内存消耗为 WHK \times lWHK×l
ReLU 操作可以执行原地修改，因此不需要额外的 feature map 存放 ReLU 的输出
Conv 操作需要生成输出 feature map 作为第 ll 层的输出，它是必须的开销

因此除了第 1,2,\dots,L1,2,…,L 层的输出 feature map 需要内存开销之外，第 ll 层还需要 2WHKl2WHKl 的内存开销来存放中间生成的临时 feature map。整个 DenseNet Block 需要 WHk(1 + L)LWHk(1+L)L 的内存开销来存放中间生成的临时 feature map, 即 DenseNet Block 的内存消耗为 O(L^2)O(L2)，是网络深度的平方关系

拼接 Concat 操作是必须的，因为当卷积的输入存放在连续的内存区域时，卷积操作的计算效率较高。而 DenseNet Block 中，第 ll 层的输入 feature map 由前面各层的输出 feature map 沿通道方向拼接而成。而这些输出 feature map 并不在连续的内存区域

另外，拼接 feature map 并不是简单的将它们拷贝在一起。由于 feature map 在 Tensorflow/PyTorch 实现中的表示为 \Re^{n \times d \times w \times h}ℜn×d×w×h (channel first)，或者 \Re^{n \times w \times h \times d}ℜn×w×h×d (channel last)，如果简单的将它们拷贝在一起则是沿着 mini batch 维度的拼接，而不是沿着通道方向的拼接, DenseNet Block 的这种内存消耗并不是 DenseNet Block 的结构引起的，而是由 深度学习库 引起的。因为 Tensorflow/PyTorch 等库在实现神经网络时，会存放中间生成的临时节点 (如BN 的输出节点)，这是为了在反向传播阶段可以直接获取临时节点的值

这是在时间代价和空间代价之间的折中：通过开辟更多的空间来存储临时值，从而在反向传播阶段节省计算。

除了临时 feature map 的内存消耗之外，网络的参数也会消耗内存。设 H_lHl 由BN-ReLU-Conv(3x3) 组成，则第 ll 层的网络参数数量为 9Lk^29Lk2 (不考虑 BN ), 整个 DenseNet Block 的参数数量为 \frac{9k^2(L+1)L}{2}29k2(L+1)L ，即 O(L)O(L), 因此网络参数的数量也是网络深度的 平方关系

由于DenseNet 参数数量与网络的深度呈平方关系，因此 DenseNet 网络的参数更多、网络容量更大。这也是 DenseNet 优于其它网络的一个重要因素
通常情况下都有 WH > \frac{9k}{2}WH>29k ，其中 W,HW,H 为网络 feature map 的宽、高，kk 为网络的增长率。所以网络参数消耗的内存要远小于临时 feature map 消耗的内存

2.4.2 内存优化

传统的 DenseNet Block 实现与内存优化的 DenseNet Block 对比如下（第 ll 层，该层的输入 feature map 来自于同一个块中早前的层的输出）

左图为传统的 DenseNet Block 的第 ll 层, 首先将 feature map 拷贝到连续的内存块，拷贝时完成拼接的操作。然后依次执行 BN、ReLU、Conv 操作。该层的临时 feature map 需要消耗内存 2WHkl2WHkl，该层的输出 feature map 需要消耗内存 WHkWHk

另外某些实现（如LuaTorch）还需要为反向传播过程的梯度分配内存, 计算 BN 层输出的梯度时，需要用到第 ll 层输出层的梯度和 BN 层的输出。存储这些梯度需要额外的 O(lk)O(lk) 的内存
另外一些实现（如PyTorch,MxNet）会对梯度使用共享的内存区域来存放这些梯度，因此只需要 O(k)O(k) 的内存

右图为内存优化的 DenseNet Block 的第 ll 层, 采用两组预分配的共享内存区 Shared memory Storage location 来存 Concate 操作和 BN 操作输出的临时feature map

第3节 PyTorch 实践

3.1 模型训练代码

加载同级目录下 train.py 程序代码

# %load train.py import os os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE" import time import argparse import sys import torch import torch.nn as nn import torch.optim as optim import torch.backends.cudnn as cudnn from torchvision import datasets, transforms from torch.autograd import Variable import matplotlib as mpl import matplotlib.pyplot as plt mpl.rc('axes', labelsize = 14) mpl.rc('xtick', labelsize = 12) mpl.rc('ytick', labelsize = 12) sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) from res import build_res50 from dense import build_dense121 from datasets.config import * from datasets.cifar import CIFAR10 from datasets.flower import shuffle_flower from datasets.oxford_iiit import shuffle_oxford def str2bool(v): return v.lower() in ("yes", "true", "t", "1") parser = argparse.ArgumentParser( description='Image Classification Training With Pytorch') train_set = parser.add_mutually_exclusive_group() parser.add_argument('--dataset', default='Flower', choices=['Flower', 'Oxford-IIIT', 'CIFAR-10'], type=str, help='Flower, Oxford-IIIT, CIFAR-10') parser.add_argument('--dataset_root', default=FLOWER_ROOT, help='Dataset root directory path') parser.add_argument('--model', default='ResNet', choices=['ResNet', 'DenseNet'], type=str, help='ResNet or DenseNet') parser.add_argument('--pretrained', default=True, type=str2bool, help='Using pretrained model weights') parser.add_argument('--crop_size', default=224, type=int, help='Resized crop value') parser.add_argument('--batch_size', default=32, type=int, help='Batch size for training') parser.add_argument('--num_workers', default=0, type=int, help='Number of workers used in dataloading') parser.add_argument('--epoch_size', default=20, type=int, help='Number of Epoches for training') parser.add_argument('--cuda', default=True, type=str2bool, help='Use CUDA to train model') parser.add_argument('--shuffle', default=False, type=str2bool, help='Shuffle new train and test folders') parser.add_argument('--lr', '--learning-rate', default=2e-4, type=float, help='initial learning rate') parser.add_argument('--save_folder', default='weights/', help='Directory for saving checkpoint models') parser.add_argument('--photo_folder', default='results/', help='Directory for saving photos') args = parser.parse_args() if not os.path.exists(args.save_folder): os.mkdir(args.save_folder) if not os.path.exists(args.photo_folder): os.mkdir(args.photo_folder) data_transform = transforms.Compose([transforms.RandomResizedCrop(args.crop_size), transforms.RandomHorizontalFlip(), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) def train(): if args.dataset == 'Flower': if not os.path.exists(FLOWER_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = FLOWER_ROOT train_path = os.path.join(FLOWER_ROOT, 'train') if not os.path.exists(train_path) or args.shuffle: shuffle_flower() dataset = datasets.ImageFolder(root=train_path,transform=data_transform) if args.dataset == 'Oxford-IIIT': if not os.path.exists(OXFORD_IIIT_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = OXFORD_IIIT_ROOT train_path = os.path.join(OXFORD_IIIT_ROOT, 'train') if not os.path.exists(train_path) or args.shuffle: shuffle_oxford() dataset = datasets.ImageFolder(root=train_path,transform=data_transform) if args.dataset == 'CIFAR-10': if not os.path.exists(CIFAR_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = CIFAR_ROOT dataset = CIFAR10(train=True,transform=data_transform,target_transform=None) classes = dataset.classes if args.model == 'ResNet': net = build_res50(phase='train', num_classes=len(classes), pretrained=args.pretrained) if args.model == 'DenseNet': net = build_dense121(phase='train', num_classes=len(classes), pretrained=args.pretrained) if args.cuda and torch.cuda.is_available(): net = torch.nn.DataParallel(net) cudnn.benchmark = True net.cuda() optimizer = optim.Adam(net.parameters(), lr=args.lr) criterion = nn.CrossEntropyLoss() epoch_size = args.epoch_size print('Loading the dataset...') data_loader = torch.utils.data.DataLoader(dataset, args.batch_size, num_workers=args.num_workers, shuffle=True, pin_memory=True) print('Training on:', args.dataset) print('Using model:', args.model) print('Using the specified args:') print(args) loss_list = [] acc_list = [] for epoch in range(epoch_size): net.train() train_loss = 0.0 correct = 0 total = len(dataset) t0 = time.perf_counter() for step, data in enumerate(data_loader, start=0): images, labels = data if args.cuda: images = Variable(images.cuda()) labels = Variable(labels.cuda()) else: images = Variable(images) labels = Variable(labels) # forward outputs = net(images) # backprop optimizer.zero_grad() loss = criterion(outputs, labels) loss.backward() optimizer.step() # print statistics train_loss += loss.item() _, predicted = outputs.max(1) correct += predicted.eq(labels).sum().item() # print train process rate = (step + 1) / len(data_loader) a = "*" * int(rate * 50) b = "." * int((1 - rate) * 50) print("\rEpoch {}: {:^3.0f}%[{}->{}]{:.3f}".format(epoch+1, int(rate * 100), a, b, loss), end="") print(' Running time: %.3f' % (time.perf_counter() - t0)) acc = 100.*correct/ total loss = train_loss / step print('train loss: %.6f, acc: %.3f%% (%d/%d)' % (loss, acc, correct, total)) loss_list.append(loss) acc_list.append(acc/100) torch.save(net.state_dict(),args.save_folder + args.dataset + "_" + args.model + '.pth') plt.plot(range(epoch_size), loss_list, range(epoch_size), acc_list) plt.xlabel('Epoches') plt.ylabel('Sparse CrossEntropy Loss | Accuracy') plt.savefig(os.path.join( os.path.dirname( os.path.abspath(__file__)), args.photo_folder, args.dataset + "_" + args.model + "_train_details.png")) if __name__ == '__main__': train()

程序输入参数说明

dataset：

训练采用的数据集，目前提供 Flower, Oxford-IIIT, CIFAR-10 供选择。点击查看数据集加载Demo

dataset_root:

数据集读取地址, default已设置为数据集相对路径，部署在云端可能需要修改

model:

训练使用的算法模型，目前提供 ResNet, DenseNet 等卷积神经网络

pretrained：

是否使用 PyTorch 预训练权重

crop_size:

数据图像预处理剪裁大小，default为224，只有 LeNet 默认使用 32\times3232×32 尺寸大小

shuffle：

是否重新生成新的train-test数据集样本

batch_size:

单次训练所抓取的数据样本数量，default为32

num_workers:

加载数据所使用线程个数，default为0，n\in (2,4,8,12\dots)n∈(2,4,8,12…)

epoch_size:

训练次数, default为20

cuda:

是否调用GPU训练

超参数学习率，采用Adam优化函数，default为 0.0020.002

save_folder:

模型权重保存地址

程序输出文件说明

训练细节

print 于 python console, 包括单个epoch训练时间、训练集损失值、准确率

模型权重

模型保存路径为 ./weight/{dataset}_{model}.pth

损失函数与正确率

图片保存路径为 ./result/{dataset}_{model}_train_details.png

3.2 模型测试代码

加载同级目录下 test.py 程序代码

# %load test.py import sys import os os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE" import argparse import torch import torch.nn as nn import torch.backends.cudnn as cudnn from torchvision import transforms, datasets from torch.autograd import Variable import itertools import numpy as np import matplotlib as mpl import matplotlib.pyplot as plt mpl.rc('axes', labelsize = 14) mpl.rc('xtick', labelsize = 12) mpl.rc('ytick', labelsize = 12) sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) from res import build_res50 from dense import build_dense121 from datasets.config import * from datasets.cifar import CIFAR10 parser = argparse.ArgumentParser( description='Convolutional Neural Network Testing With Pytorch') parser.add_argument('--dataset', default='Flower', choices=['Flower', 'Oxford-IIIT', 'CIFAR-10'], type=str, help='Flower, Oxford-IIIT, or CIFAR-10') parser.add_argument('--dataset_root', default=FLOWER_ROOT, help='Dataset root directory path') parser.add_argument('--model', default='ResNet', choices=['ResNet', 'DenseNet'], type=str, help='ResNet or DenseNet') parser.add_argument('--crop_size', default=224, type=int, help='Resized crop value') parser.add_argument('--batch_size', default=32, type=int, help='Batch size for training') parser.add_argument('--num_workers', default=0, type=int, help='Number of workers used in dataloading') parser.add_argument('--weight', default='weights/{}_{}.pth', type=str, help='Trained state_dict file path to open') parser.add_argument('--cuda', default=True, type=bool, help='Use cuda to train model') parser.add_argument('--pretrained', default=True, type=bool, help='Using pretrained model weights') parser.add_argument('-f', default=None, type=str, help="Dummy arg so we can load in Jupyter Notebooks") args = parser.parse_args() args.weight = args.weight.format(args.dataset,args.model) data_transform = transforms.Compose([transforms.RandomResizedCrop(args.crop_size), transforms.RandomHorizontalFlip(), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) def confusion_matrix(preds, labels, conf_matrix): for p, t in zip(preds, labels): conf_matrix[p, t] += 1 return conf_matrix def save_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues): plt.imshow(cm, interpolation='nearest', cmap=cmap) plt.title(title) plt.colorbar() tick_marks = np.arange(len(classes)) plt.xticks(tick_marks, classes, rotation=90) plt.yticks(tick_marks, classes) plt.axis("equal") ax = plt.gca() left, right = plt.xlim() ax.spines['left'].set_position(('data', left)) ax.spines['right'].set_position(('data', right)) for edge_i in ['top', 'bottom', 'right', 'left']: ax.spines[edge_i].set_edgecolor("white") thresh = cm.max() / 2. for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])): num = '{:.2f}'.format(cm[i, j]) if normalize else int(cm[i, j]) plt.text(j, i, num, verticalalignment='center', horizontalalignment="center", color="white" if num > thresh else "black") plt.ylabel('True label') plt.xlabel('Predicted label') plt.savefig(os.path.join( os.path.dirname( os.path.abspath(__file__)), "results", args.dataset + '_confusion_matrix.png')) def test(): # load data if args.dataset == 'Flower': if not os.path.exists(FLOWER_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = FLOWER_ROOT test_path = os.path.join(FLOWER_ROOT, 'val') if not os.path.exists(test_path): parser.error('Must train models before evaluating') dataset = datasets.ImageFolder(root=test_path,transform=data_transform) if args.dataset == 'Oxford-IIIT': if not os.path.exists(OXFORD_IIIT_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = OXFORD_IIIT_ROOT test_path = os.path.join(OXFORD_IIIT_ROOT, 'val') if not os.path.exists(test_path): parser.error('Must train models before evaluating') dataset = datasets.ImageFolder(root=test_path,transform=data_transform) if args.dataset == 'CIFAR-10': if not os.path.exists(CIFAR_ROOT): parser.error('Must specify dataset_root if specifying dataset') args.dataset_root = CIFAR_ROOT dataset = CIFAR10(train=False,transform=data_transform,target_transform=None) classes = dataset.classes num_classes = len(classes) data_loader = torch.utils.data.DataLoader(dataset, args.batch_size, num_workers=args.num_workers, shuffle=True, pin_memory=True) # load net if args.model == 'ResNet': net = build_res50(phase='train', num_classes=len(classes), pretrained=args.pretrained) if args.model == 'DenseNet': net = build_dense121(phase='train', num_classes=len(classes), pretrained=args.pretrained) if args.cuda and torch.cuda.is_available(): net = torch.nn.DataParallel(net) cudnn.benchmark = True net.cuda() net.load_state_dict(torch.load(args.weight)) print('Finish loading model: ', args.weight) net.eval() print('Training on:', args.dataset) print('Using model:', args.model) print('Using the specified args:') print(args) # evaluation criterion = nn.CrossEntropyLoss() test_loss = 0 correct = 0 total = 0 conf_matrix = torch.zeros(num_classes, num_classes) class_correct = list(0 for i in range(num_classes)) class_total = list(0 for i in range(num_classes)) with torch.no_grad(): for step, data in enumerate(data_loader): images, labels = data if args.cuda: images = Variable(images.cuda()) labels = Variable(labels.cuda()) else: images = Variable(images) labels = Variable(labels) # forward outputs = net(images) loss = criterion(outputs, labels) test_loss += loss.item() _, predicted = outputs.max(1) conf_matrix = confusion_matrix(predicted, labels=labels, conf_matrix=conf_matrix) total += labels.size(0) correct += predicted.eq(labels).sum().item() c = (predicted.eq(labels)).squeeze() for i in range(c.size(0)): label = labels[i] class_correct[label] += c[i].item() class_total[label] += 1 acc = 100.* correct / total loss = test_loss / step print('test loss: %.6f, acc: %.3f%% (%d/%d)' % (loss, acc, correct, total)) for i in range(num_classes): print('accuracy of %s : %.3f%% (%d/%d)' % ( str(classes[i]), 100 * class_correct[i] / class_total[i], class_correct[i], class_total[i])) save_confusion_matrix(conf_matrix.numpy(), classes=classes, normalize=False, title = 'Normalized confusion matrix') if __name__ == '__main__': test()

程序输入参数说明

训练采用的数据集，目前提供 Flower, Oxford-IIIT, CIFAR-10 供选择。点击查看数据集加载Demo

dataset_root:

数据集读取地址, default已设置为数据集相对路径，部署在云端可能需要修改

model:

训练使用的算法模型，目前提供 ResNet, DenseNet 等卷积神经网络

pretrained：

是否使用 PyTorch 预训练权重

crop_size:

数据图像预处理剪裁大小，default为224，只有 LeNet 默认使用 32\times3232×32 尺寸大小

shuffle：

单次训练所抓取的数据样本数量，default为32

num_workers:

加载数据所使用线程个数，default为0，n\in (2,4,8,12\dots)n∈(2,4,8,12…)

trained_model:

模型权重保存路径，default为 train.py 生成的ptb文件路径

cuda:

是否调用GPU训练

程序输出文件说明

测试集损失值与准确率

print 于 python console 第一行

各类别准确率

print 于 python console 后续列表

混淆矩阵

图片保存路径为 ./photos/%_confusion_matrix.png

第4节 Flower 数据集

Flower 数据集来自 Tensorflow 团队，创建于 2019 年 1 月，作为 入门级轻量数据集 包含5个花卉类别 [‘daisy’, ‘dandelion’, ‘roses’, ‘sunflowers’, ‘tulips’]
Flower 数据集是深度学习图像分类中经典的一个数据集，各个类别有 [633, 898, 641, 699, 799] 个样本，每个样本都是一张 320\times232320×232 像素的RGB图片
Dataset 库中的 flower.py 按照 0.1 的比例实现训练集与测试集的 样本分离

4.1 ResNet

%run train.py --dataset Flower --model ResNet --pretrained True

Loading the dataset...
Training on: Flower
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Flower', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\FLOWER', epoch_size=20, lr=0.0002, model='ResNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]0.875  Running time: 26.564
train loss: 1.081059, acc: 63.733% (2107/3306)
Epoch 2: 100%[**************************************************->]0.601  Running time: 25.509
train loss: 0.677766, acc: 78.736% (2603/3306)
Epoch 3: 100%[**************************************************->]0.456  Running time: 26.531
train loss: 0.559504, acc: 81.942% (2709/3306)
Epoch 4: 100%[**************************************************->]1.432  Running time: 26.556
train loss: 0.520002, acc: 82.638% (2732/3306)
Epoch 5: 100%[**************************************************->]0.427  Running time: 26.434
train loss: 0.470244, acc: 83.666% (2766/3306)
Epoch 6: 100%[**************************************************->]0.331  Running time: 25.863
train loss: 0.466301, acc: 84.392% (2790/3306)
Epoch 7: 100%[**************************************************->]0.640  Running time: 26.079
train loss: 0.425827, acc: 85.390% (2823/3306)
Epoch 8: 100%[**************************************************->]0.568  Running time: 26.056
train loss: 0.415019, acc: 85.935% (2841/3306)
Epoch 9: 100%[**************************************************->]0.174  Running time: 26.167
train loss: 0.389266, acc: 86.388% (2856/3306)
Epoch 10: 100%[**************************************************->]0.865  Running time: 26.185
train loss: 0.396210, acc: 86.842% (2871/3306)
Epoch 11: 100%[**************************************************->]0.326  Running time: 25.983
train loss: 0.401938, acc: 86.449% (2858/3306)
Epoch 12: 100%[**************************************************->]0.729  Running time: 26.115
train loss: 0.397555, acc: 86.570% (2862/3306)
Epoch 13: 100%[**************************************************->]0.832  Running time: 25.905
train loss: 0.377356, acc: 86.903% (2873/3306)
Epoch 14: 100%[**************************************************->]0.333  Running time: 26.271
train loss: 0.369491, acc: 87.447% (2891/3306)
Epoch 15: 100%[**************************************************->]0.061  Running time: 25.992
train loss: 0.378465, acc: 87.266% (2885/3306)
Epoch 16: 100%[**************************************************->]0.879  Running time: 26.118
train loss: 0.355365, acc: 88.052% (2911/3306)
Epoch 17: 100%[**************************************************->]0.665  Running time: 26.128
train loss: 0.364475, acc: 87.175% (2882/3306)
Epoch 18: 100%[**************************************************->]0.498  Running time: 25.985
train loss: 0.349165, acc: 88.294% (2919/3306)
Epoch 19: 100%[**************************************************->]0.761  Running time: 26.108
train loss: 0.358627, acc: 87.598% (2896/3306)
Epoch 20: 100%[**************************************************->]0.241  Running time: 26.108
train loss: 0.331572, acc: 89.080% (2945/3306)

%run test.py --dataset Flower --model ResNet --pretrained True

Finish loading model:  weights/Flower_ResNet.pth
Training on: Flower
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Flower', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\FLOWER', f=None, model='ResNet', num_workers=0, pretrained=True, weight='weights/Flower_ResNet.pth')
test loss: 0.305797, acc: 90.110% (328/364)
accuracy of daisy : 87.302% (55/63)
accuracy of dandelion : 92.135% (82/89)
accuracy of roses : 90.625% (58/64)
accuracy of sunflowers : 89.855% (62/69)
accuracy of tulips : 89.873% (71/79)

4.2 DenseNet

%run train.py --dataset Flower --model DenseNet --pretrained True

Loading the dataset...
Training on: Flower
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Flower', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\FLOWER', epoch_size=20, lr=0.0002, model='DenseNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]1.358  Running time: 28.533
train loss: 1.479844, acc: 38.869% (1285/3306)
Epoch 2: 100%[**************************************************->]1.313  Running time: 26.414
train loss: 1.317668, acc: 46.854% (1549/3306)
Epoch 3: 100%[**************************************************->]1.420  Running time: 26.416
train loss: 1.242020, acc: 49.819% (1647/3306)
Epoch 4: 100%[**************************************************->]1.200  Running time: 26.329
train loss: 1.199207, acc: 51.754% (1711/3306)
Epoch 5: 100%[**************************************************->]1.288  Running time: 26.556
train loss: 1.163759, acc: 54.809% (1812/3306)
Epoch 6: 100%[**************************************************->]0.689  Running time: 26.803
train loss: 1.138236, acc: 55.203% (1825/3306)
Epoch 7: 100%[**************************************************->]0.803  Running time: 27.322
train loss: 1.122978, acc: 55.596% (1838/3306)
Epoch 8: 100%[**************************************************->]1.180  Running time: 26.789
train loss: 1.102592, acc: 57.350% (1896/3306)
Epoch 9: 100%[**************************************************->]1.098  Running time: 26.681
train loss: 1.107488, acc: 56.594% (1871/3306)
Epoch 10: 100%[**************************************************->]1.315  Running time: 26.620
train loss: 1.065998, acc: 58.288% (1927/3306)
Epoch 11: 100%[**************************************************->]1.490  Running time: 27.548
train loss: 1.078985, acc: 57.864% (1913/3306)
Epoch 12: 100%[**************************************************->]1.067  Running time: 26.945
train loss: 1.056432, acc: 59.679% (1973/3306)
Epoch 13: 100%[**************************************************->]1.428  Running time: 26.677
train loss: 1.055029, acc: 59.952% (1982/3306)
Epoch 14: 100%[**************************************************->]0.811  Running time: 26.552
train loss: 1.063877, acc: 60.284% (1993/3306)
Epoch 15: 100%[**************************************************->]1.409  Running time: 26.576
train loss: 1.037472, acc: 60.315% (1994/3306)
Epoch 16: 100%[**************************************************->]1.187  Running time: 26.102
train loss: 1.026487, acc: 60.950% (2015/3306)
Epoch 17: 100%[**************************************************->]1.233  Running time: 26.227
train loss: 1.031807, acc: 60.738% (2008/3306)
Epoch 18: 100%[**************************************************->]1.221  Running time: 26.052
train loss: 1.002148, acc: 61.464% (2032/3306)
Epoch 19: 100%[**************************************************->]0.771  Running time: 26.276
train loss: 0.998473, acc: 62.129% (2054/3306)
Epoch 20: 100%[**************************************************->]1.198  Running time: 26.112
train loss: 0.992600, acc: 62.855% (2078/3306)

%run test.py --dataset Flower --model DenseNet --pretrained True

Finish loading model:  weights/Flower_DenseNet.pth
Training on: Flower
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Flower', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\FLOWER', f=None, model='DenseNet', num_workers=0, pretrained=True, weight='weights/Flower_DenseNet.pth')
test loss: 1.103226, acc: 64.286% (234/364)
accuracy of daisy : 55.556% (35/63)
accuracy of dandelion : 82.022% (73/89)
accuracy of roses : 35.938% (23/64)
accuracy of sunflowers : 76.812% (53/69)
accuracy of tulips : 63.291% (50/79)

第5节 Oxford-IIIT 数据集

Oxford-IIIT 数据集覆盖 30 个种类的猫狗品种，每个类别收集了约 200 张图像样本
Oxford-IIIT 中每张图像在尺寸、姿势、光暗程度上有很大的浮动，但所有的图像都匹配了相关联的品种、头部框架定位、三维像素点语义分割的标注信息
Dataset 库中的 oxford_iiit.py 按照 0.1 的比例实现训练集与测试集的样本分离

5.1 ResNet

%run train.py --dataset Oxford-IIIT --model ResNet --pretrained True

Loading the dataset...
Training on: Oxford-IIIT
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Oxford-IIIT', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\OXFORD-IIIT', epoch_size=20, lr=0.0002, model='ResNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]1.672  Running time: 52.758
train loss: 2.521808, acc: 38.588% (2083/5398)
Epoch 2: 100%[**************************************************->]1.263  Running time: 50.631
train loss: 1.373019, acc: 66.154% (3571/5398)
Epoch 3: 100%[**************************************************->]1.509  Running time: 50.305
train loss: 1.020504, acc: 72.305% (3903/5398)
Epoch 4: 100%[**************************************************->]0.838  Running time: 49.587
train loss: 0.868986, acc: 75.213% (4060/5398)
Epoch 5: 100%[**************************************************->]0.843  Running time: 50.557
train loss: 0.793433, acc: 77.418% (4179/5398)
Epoch 6: 100%[**************************************************->]0.902  Running time: 51.422
train loss: 0.749010, acc: 77.306% (4173/5398)
Epoch 7: 100%[**************************************************->]1.209  Running time: 50.523
train loss: 0.707766, acc: 78.770% (4252/5398)
Epoch 8: 100%[**************************************************->]0.759  Running time: 50.358
train loss: 0.655945, acc: 80.196% (4329/5398)
Epoch 9: 100%[**************************************************->]0.902  Running time: 50.031
train loss: 0.645212, acc: 80.511% (4346/5398)
Epoch 10: 100%[**************************************************->]0.490  Running time: 49.170
train loss: 0.634509, acc: 80.548% (4348/5398)
Epoch 11: 100%[**************************************************->]0.556  Running time: 49.356
train loss: 0.615873, acc: 80.474% (4344/5398)
Epoch 12: 100%[**************************************************->]0.503  Running time: 50.010
train loss: 0.614606, acc: 80.622% (4352/5398)
Epoch 13: 100%[**************************************************->]0.356  Running time: 49.389
train loss: 0.591290, acc: 82.049% (4429/5398)
Epoch 14: 100%[**************************************************->]0.484  Running time: 49.485
train loss: 0.576546, acc: 82.179% (4436/5398)
Epoch 15: 100%[**************************************************->]0.473  Running time: 49.611
train loss: 0.573398, acc: 81.975% (4425/5398)
Epoch 16: 100%[**************************************************->]0.656  Running time: 50.217
train loss: 0.584155, acc: 81.938% (4423/5398)
Epoch 17: 100%[**************************************************->]0.820  Running time: 50.045
train loss: 0.576462, acc: 82.104% (4432/5398)
Epoch 18: 100%[**************************************************->]0.566  Running time: 49.894
train loss: 0.589368, acc: 81.308% (4389/5398)
Epoch 19: 100%[**************************************************->]0.439  Running time: 49.606
train loss: 0.539627, acc: 83.938% (4531/5398)
Epoch 20: 100%[**************************************************->]0.558  Running time: 50.065
train loss: 0.556025, acc: 83.012% (4481/5398)

%run test.py --dataset Oxford-IIIT --model ResNet --pretrained True

Finish loading model:  weights/Oxford-IIIT_ResNet.pth
Training on: Oxford-IIIT
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Oxford-IIIT', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\OXFORD-IIIT', f=None, model='ResNet', num_workers=0, pretrained=True, weight='weights/Oxford-IIIT_ResNet.pth')
test loss: 0.455718, acc: 86.978% (521/599)
accuracy of Abyssinian : 95.000% (19/20)
accuracy of American_Bulldog : 80.000% (16/20)
accuracy of American_Pit_Bull_Terrier : 70.000% (14/20)
accuracy of Basset_Hound : 80.000% (16/20)
accuracy of Beagle : 90.000% (18/20)
accuracy of Bengal : 85.000% (17/20)
accuracy of Birman : 95.000% (19/20)
accuracy of Bombay : 100.000% (20/20)
accuracy of Boxer : 85.000% (17/20)
accuracy of British_Shorthair : 75.000% (15/20)
accuracy of Chihuahua : 80.000% (16/20)
accuracy of Egyptian_Mau : 85.000% (17/20)
accuracy of English_Cocker_Spaniel : 100.000% (20/20)
accuracy of English_Setter : 85.000% (17/20)
accuracy of German_Shorthaired : 95.000% (19/20)
accuracy of Great_Pyrenees : 95.000% (19/20)
accuracy of Havanese : 90.000% (18/20)
accuracy of Japanese_Chin : 90.000% (18/20)
accuracy of Keeshond : 85.000% (17/20)
accuracy of Leonberger : 95.000% (19/20)
accuracy of Maine_Coon : 75.000% (15/20)
accuracy of Miniature_Pinscher : 85.000% (17/20)
accuracy of Newfoundland : 95.000% (19/20)
accuracy of Persian : 95.000% (19/20)
accuracy of Pomeranian : 85.000% (17/20)
accuracy of Pug : 70.000% (14/20)
accuracy of Ragdoll : 80.000% (16/20)
accuracy of Russian_Blue : 85.000% (17/20)
accuracy of Saint_Bernard : 95.000% (19/20)
accuracy of Samoyed : 89.474% (17/19)

5.2 DenseNet

%run train.py --dataset Oxford-IIIT --model DenseNet --pretrained True

Loading the dataset...
Training on: Oxford-IIIT
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Oxford-IIIT', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\OXFORD-IIIT', epoch_size=20, lr=0.0002, model='DenseNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]3.417  Running time: 52.623
train loss: 3.389321, acc: 5.947% (321/5398)
Epoch 2: 100%[**************************************************->]3.239  Running time: 51.639
train loss: 3.324268, acc: 7.873% (425/5398)
Epoch 3: 100%[**************************************************->]3.194  Running time: 51.499
train loss: 3.275900, acc: 9.392% (507/5398)
Epoch 4: 100%[**************************************************->]3.118  Running time: 51.520
train loss: 3.237189, acc: 10.634% (574/5398)
Epoch 5: 100%[**************************************************->]3.091  Running time: 51.456
train loss: 3.192028, acc: 11.801% (637/5398)
Epoch 6: 100%[**************************************************->]3.105  Running time: 52.150
train loss: 3.159566, acc: 12.523% (676/5398)
Epoch 7: 100%[**************************************************->]3.357  Running time: 51.849
train loss: 3.129416, acc: 13.431% (725/5398)
Epoch 8: 100%[**************************************************->]3.291  Running time: 51.711
train loss: 3.114922, acc: 13.357% (721/5398)
Epoch 9: 100%[**************************************************->]3.088  Running time: 50.402
train loss: 3.082170, acc: 14.357% (775/5398)
Epoch 10: 100%[**************************************************->]2.907  Running time: 52.089
train loss: 3.059112, acc: 14.783% (798/5398)
Epoch 11: 100%[**************************************************->]3.184  Running time: 51.797
train loss: 3.043412, acc: 14.598% (788/5398)
Epoch 12: 100%[**************************************************->]2.944  Running time: 51.530
train loss: 3.036861, acc: 15.969% (862/5398)
Epoch 13: 100%[**************************************************->]2.953  Running time: 51.445
train loss: 3.015575, acc: 15.932% (860/5398)
Epoch 14: 100%[**************************************************->]2.936  Running time: 51.216
train loss: 3.003686, acc: 15.895% (858/5398)
Epoch 15: 100%[**************************************************->]3.334  Running time: 50.708
train loss: 2.990475, acc: 16.803% (907/5398)
Epoch 16: 100%[**************************************************->]3.245  Running time: 51.944
train loss: 2.975292, acc: 16.673% (900/5398)
Epoch 17: 100%[**************************************************->]2.933  Running time: 51.413
train loss: 2.960971, acc: 17.877% (965/5398)
Epoch 18: 100%[**************************************************->]2.980  Running time: 51.584
train loss: 2.959703, acc: 16.932% (914/5398)
Epoch 19: 100%[**************************************************->]2.703  Running time: 50.915
train loss: 2.943565, acc: 17.451% (942/5398)
Epoch 20: 100%[**************************************************->]2.789  Running time: 50.066
train loss: 2.923782, acc: 17.673% (954/5398)

%run test.py --dataset Oxford-IIIT --model DenseNet --pretrained True

Finish loading model:  weights/Oxford-IIIT_DenseNet.pth
Training on: Oxford-IIIT
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='Oxford-IIIT', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\OXFORD-IIIT', f=None, model='DenseNet', num_workers=0, pretrained=True, weight='weights/Oxford-IIIT_DenseNet.pth')
test loss: 3.210763, acc: 14.524% (87/599)
accuracy of Abyssinian : 15.000% (3/20)
accuracy of American_Bulldog : 0.000% (0/20)
accuracy of American_Pit_Bull_Terrier : 0.000% (0/20)
accuracy of Basset_Hound : 15.000% (3/20)
accuracy of Beagle : 5.000% (1/20)
accuracy of Bengal : 15.000% (3/20)
accuracy of Birman : 5.000% (1/20)
accuracy of Bombay : 55.000% (11/20)
accuracy of Boxer : 5.000% (1/20)
accuracy of British_Shorthair : 20.000% (4/20)
accuracy of Chihuahua : 10.000% (2/20)
accuracy of Egyptian_Mau : 40.000% (8/20)
accuracy of English_Cocker_Spaniel : 0.000% (0/20)
accuracy of English_Setter : 0.000% (0/20)
accuracy of German_Shorthaired : 40.000% (8/20)
accuracy of Great_Pyrenees : 10.000% (2/20)
accuracy of Havanese : 25.000% (5/20)
accuracy of Japanese_Chin : 15.000% (3/20)
accuracy of Keeshond : 20.000% (4/20)
accuracy of Leonberger : 5.000% (1/20)
accuracy of Maine_Coon : 5.000% (1/20)
accuracy of Miniature_Pinscher : 0.000% (0/20)
accuracy of Newfoundland : 15.000% (3/20)
accuracy of Persian : 0.000% (0/20)
accuracy of Pomeranian : 20.000% (4/20)
accuracy of Pug : 5.000% (1/20)
accuracy of Ragdoll : 35.000% (7/20)
accuracy of Russian_Blue : 20.000% (4/20)
accuracy of Saint_Bernard : 15.000% (3/20)
accuracy of Samoyed : 21.053% (4/19)

第6节 CIFAR-10 数据集

CIFAR-10 数据集是 Visual Dictionary (Teaching computers to recognize objects) 的子集，由三个多伦多大学教授收集，主要来自Google和各类搜索引擎的图片
CIFAR-10 数据集包含 60000 张 32\times3232×32 的RBG彩色图像，共计 10 个包含 6000 张样本图像的不同类别，训练集包含 50000 张图像样本，测试集包含 10000 张图像样本
CIFAR-10 数据集在深度学习初期 (ImageNet 问世前) 一直是衡量各种算法模型的 benchmark，但其 32\times3232×32 的图像尺寸逐渐无法满足日渐飞速迭代的神经网络结构

6.1 ResNet

%run train.py --dataset CIFAR-10 --model ResNet --pretrained True

Loading the dataset...
Training on: CIFAR-10
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='CIFAR-10', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\CIFAR-10', epoch_size=20, lr=0.0002, model='ResNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]1.182  Running time: 268.943
train loss: 1.536603, acc: 46.160% (23080/50000)
Epoch 2: 100%[**************************************************->]1.599  Running time: 272.843
train loss: 1.306745, acc: 54.256% (27128/50000)
Epoch 3: 100%[**************************************************->]0.755  Running time: 271.827
train loss: 1.264989, acc: 55.674% (27837/50000)
Epoch 4: 100%[**************************************************->]1.132  Running time: 271.198
train loss: 1.244303, acc: 56.414% (28207/50000)
Epoch 5: 100%[**************************************************->]0.973  Running time: 270.233
train loss: 1.222554, acc: 56.928% (28464/50000)
Epoch 6: 100%[**************************************************->]1.087  Running time: 275.036
train loss: 1.217354, acc: 57.220% (28610/50000)
Epoch 7: 100%[**************************************************->]0.884  Running time: 274.613
train loss: 1.210398, acc: 57.528% (28764/50000)
Epoch 8: 100%[**************************************************->]1.285  Running time: 273.153
train loss: 1.196946, acc: 57.820% (28910/50000)
Epoch 9: 100%[**************************************************->]1.259  Running time: 273.424
train loss: 1.185041, acc: 58.506% (29253/50000)
Epoch 10: 100%[**************************************************->]1.548  Running time: 273.988
train loss: 1.181838, acc: 58.526% (29263/50000)
Epoch 11: 100%[**************************************************->]1.346  Running time: 273.759
train loss: 1.178121, acc: 58.676% (29338/50000)
Epoch 12: 100%[**************************************************->]0.835  Running time: 275.765
train loss: 1.170078, acc: 59.114% (29557/50000)
Epoch 13: 100%[**************************************************->]1.329  Running time: 272.255
train loss: 1.169670, acc: 58.960% (29480/50000)
Epoch 14: 100%[**************************************************->]1.506  Running time: 272.848
train loss: 1.168470, acc: 59.054% (29527/50000)
Epoch 15: 100%[**************************************************->]1.494  Running time: 270.235
train loss: 1.162010, acc: 59.134% (29567/50000)
Epoch 16: 100%[**************************************************->]1.065  Running time: 270.341
train loss: 1.157648, acc: 59.338% (29669/50000)
Epoch 17: 100%[**************************************************->]1.498  Running time: 271.751
train loss: 1.155461, acc: 59.522% (29761/50000)
Epoch 18: 100%[**************************************************->]0.791  Running time: 274.548
train loss: 1.151811, acc: 59.386% (29693/50000)
Epoch 19: 100%[**************************************************->]1.116  Running time: 274.017
train loss: 1.153108, acc: 59.550% (29775/50000)
Epoch 20: 100%[**************************************************->]1.234  Running time: 275.824
train loss: 1.150793, acc: 59.674% (29837/50000)

%run test.py --dataset CIFAR-10 --model ResNet --pretrained True

Finish loading model:  weights/CIFAR-10_ResNet.pth
Training on: CIFAR-10
Using model: ResNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='CIFAR-10', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\CIFAR-10', f=None, model='ResNet', num_workers=0, pretrained=True, weight='weights/CIFAR-10_ResNet.pth')
test loss: 1.023940, acc: 63.890% (6389/10000)
accuracy of airplane : 65.500% (655/1000)
accuracy of automobile : 66.100% (661/1000)
accuracy of bird : 49.400% (494/1000)
accuracy of cat : 54.700% (547/1000)
accuracy of deer : 58.200% (582/1000)
accuracy of dog : 55.600% (556/1000)
accuracy of frog : 77.300% (773/1000)
accuracy of horse : 65.000% (650/1000)
accuracy of ship : 74.000% (740/1000)
accuracy of truck : 73.100% (731/1000)

6.2 DenseNet

%run train.py --dataset CIFAR-10 --model DenseNet --pretrained True

Loading the dataset...
Training on: CIFAR-10
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='CIFAR-10', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\CIFAR-10', epoch_size=20, lr=0.0002, model='DenseNet', num_workers=0, photo_folder='results/', pretrained=True, save_folder='weights/', shuffle=False)
Epoch 1: 100%[**************************************************->]2.152  Running time: 281.216
train loss: 2.011725, acc: 25.864% (12932/50000)
Epoch 2: 100%[**************************************************->]1.745  Running time: 279.927
train loss: 1.880436, acc: 31.006% (15503/50000)
Epoch 3: 100%[**************************************************->]1.949  Running time: 328.740
train loss: 1.832803, acc: 33.142% (16571/50000)
Epoch 4: 100%[**************************************************->]1.841  Running time: 307.168
train loss: 1.796165, acc: 34.670% (17335/50000)
Epoch 5: 100%[**************************************************->]2.095  Running time: 279.075
train loss: 1.776985, acc: 35.518% (17759/50000)
Epoch 6: 100%[**************************************************->]1.631  Running time: 278.310
train loss: 1.759114, acc: 36.102% (18051/50000)
Epoch 7: 100%[**************************************************->]2.244  Running time: 279.672
train loss: 1.742437, acc: 36.962% (18481/50000)
Epoch 8: 100%[**************************************************->]1.967  Running time: 278.730
train loss: 1.728809, acc: 37.422% (18711/50000)
Epoch 9: 100%[**************************************************->]1.609  Running time: 282.537
train loss: 1.716504, acc: 38.170% (19085/50000)
Epoch 10: 100%[**************************************************->]1.459  Running time: 280.416
train loss: 1.711381, acc: 38.320% (19160/50000)
Epoch 11: 100%[**************************************************->]1.418  Running time: 279.299
train loss: 1.705938, acc: 38.524% (19262/50000)
Epoch 12: 100%[**************************************************->]1.376  Running time: 279.639
train loss: 1.689652, acc: 38.984% (19492/50000)
Epoch 13: 100%[**************************************************->]1.685  Running time: 278.627
train loss: 1.692426, acc: 39.058% (19529/50000)
Epoch 14: 100%[**************************************************->]1.587  Running time: 279.243
train loss: 1.680914, acc: 39.208% (19604/50000)
Epoch 15: 100%[**************************************************->]1.306  Running time: 277.855
train loss: 1.683740, acc: 39.166% (19583/50000)
Epoch 16: 100%[**************************************************->]1.839  Running time: 279.118
train loss: 1.668011, acc: 39.766% (19883/50000)
Epoch 17: 100%[**************************************************->]2.164  Running time: 278.432
train loss: 1.662733, acc: 40.276% (20138/50000)
Epoch 18: 100%[**************************************************->]2.060  Running time: 280.945
train loss: 1.659158, acc: 40.250% (20125/50000)
Epoch 19: 100%[**************************************************->]1.370  Running time: 280.071
train loss: 1.656844, acc: 40.446% (20223/50000)
Epoch 20: 100%[**************************************************->]1.381  Running time: 278.043
train loss: 1.648796, acc: 41.018% (20509/50000)

%run test.py --dataset CIFAR-10 --model DenseNet --pretrained True

Finish loading model:  weights/CIFAR-10_DenseNet.pth
Training on: CIFAR-10
Using model: DenseNet
Using the specified args:
Namespace(batch_size=32, crop_size=224, cuda=True, dataset='CIFAR-10', dataset_root='C:\\Users\\sbzy\\Documents/GitHub/dl_algorithm/datasets\\CIFAR-10', f=None, model='DenseNet', num_workers=0, pretrained=True, weight='weights/CIFAR-10_DenseNet.pth')
test loss: 1.615522, acc: 42.230% (4223/10000)
accuracy of airplane : 47.600% (476/1000)
accuracy of automobile : 60.400% (604/1000)
accuracy of bird : 22.100% (221/1000)
accuracy of cat : 38.400% (384/1000)
accuracy of deer : 39.800% (398/1000)
accuracy of dog : 38.500% (385/1000)
accuracy of frog : 54.000% (540/1000)
accuracy of horse : 34.100% (341/1000)
accuracy of ship : 45.400% (454/1000)
accuracy of truck : 42.000% (420/1000)

开始实验

你可能感兴趣的:(Java,Python高级,Python基础,cnn,深度学习,神经网络)

Python深度学习实践：LSTM与GRU在序列数据预测中的应用 AI智能应用 Python入门实战计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
Python深度学习实践：LSTM与GRU在序列数据预测中的应用作者：禅与计算机程序设计艺术/ZenandtheArtofComputerProgramming1.背景介绍1.1问题的由来序列数据预测是机器学习领域的一个重要研究方向，涉及时间序列分析、自然语言处理、语音识别等多个领域。序列数据具有时间依赖性，即序列中每个元素都受到前面元素的影响。传统的机器学习算法难以捕捉这种时间依赖性，而深度学习
关于Java中的private final、static修饰的方法讴歌oge java 开发语言
privatefinal修饰的方法示例代码：classCarextendsVehicle{publicstaticvoidmain(String[]args){newCar().run();//创建Car实例并调用run()方法}privatefinalvoidrun(){System.out.println("Car");//打印"Car"}}classVehicle{privatefinalv
10.jobManager初始化流程
JobManager初始化流程1.找到入口类StandaloneSessionClusterEntrypoint该类位于Flink源码的以下路径中：flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/StandaloneSessionClusterEntrypoint.java2.查看main方法/**Entrypoint
Python高级数据类型：字典（Dictionary） PythonicCC python 开发语言
字典是Python中非常重要且实用的数据结构，本文将全面详细地介绍字典的所有知识点，从基础概念到高级用法，帮助初学者彻底掌握字典的使用。1.字典简介1.1为什么需要字典？假设我们需要存储公司员工的姓名、年龄、职务和工资信息。使用列表可以这样实现：staff_list=[["tom",20,"teacher",6000],["rose",18,"hr",5000],["jack",20,"行政",4
Java中字符串的创建过程及intern()方法讴歌oge java 开发语言 String intern StringBuilder
一、字符串的创建过程1.Strings="abc"首先在字符串常量池中查找是否有"abc"如果常量池中没有"abc"，则创建一个"abc"对象放入常量池，进行下一步；如果有，直接进行下一步变量s指向常量池中的"abc"对象2.Strings=newString("abc");创建过程：首先在字符串常量池中查找是否有"abc"如果常量池中没有"abc"，则创建一个"abc"对象放入常量池，进行下一步
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现打架检测（C#代码，UI界面版）格林威工业相机机器视觉数码相机 YOLO 深度学习计算机视觉人工智能
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现打架检测（C#代码，UI界面版）工业相机使用YoloV8模型实现打架检测工业相机通过YoloV8模型实现打架检测的技术背景在相机SDK中获取图像转换图像的代码分析工业相机图像转换Bitmap图像格式和Mat图像重要核心代码本地文件图像转换Bitmap图像格式和Mat图像重要核心代码Mat图像导入YoloV8模型重要核心代码代码实现
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现人脸识别检测（C#代码，UI界面版）格林威机器视觉工业相机数码相机 YOLO 深度学习人工智能视觉检测 c#
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现人脸识别检测（C#代码，UI界面版）工业相机使用YoloV8模型实现人脸的检测工业相机通过YoloV8模型实现人脸识别检测的技术背景在相机SDK中获取图像转换图像的代码分析工业相机图像转换Bitmap图像格式和Mat图像重要核心代码本地文件图像转换Bitmap图像格式和Mat图像重要核心代码Mat图像导入YoloV8模型重要核心代
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现人物识别（C#代码，UI界面版）格林威工业相机机器视觉数码相机 YOLO c#人工智能计算机视觉开发语言
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现人物识别（C#代码，UI界面版）工业相机使用YoloV8模型实现人物识别工业相机实现YoloV8模型实现人物识别的技术背景在相机SDK中获取图像转换图像的代码分析工业相机图像转换Bitmap图像格式和Mat图像重要核心代码本地文件图像转换Bitmap图像格式和Mat图像重要核心代码Mat图像导入YoloV8模型重要核心代码代码实现
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现动物分类（C#源码，UI界面版）格林威机器视觉工业相机数码相机 YOLO 深度学习计算机视觉人工智能视觉检测 c#
Baumer工业相机堡盟工业相机如何通过YoloV8深度学习模型实现动物分类（C#源码，UI界面版））工业相机使用YoloV8模型实现动物分类工业相机实现YoloV8模型实现动物分类的技术背景在相机SDK中获取图像转换图像的代码分析工业相机图像转换Bitmap图像格式和Mat图像重要核心代码本地文件图像转换Bitmap图像格式和Mat图像重要核心代码Mat图像导入YoloV8模型重要核心代码代码实
鸿蒙与web混合开发双向通信屿筱鸿蒙 HarmonyOS5
鸿蒙与web混合开发双向通信用runJavaScript和registerJavaScriptProxywebentry/src/main/resources/rawfile/1.html混合开发打开相册//直接写js代码functionchangeImg(){//1.获取img这个元素constimg=document.querySelector('img')//2.修改元素的属性img.src
java毕业设计-基于Javaweb的家常小菜烹饪学习管理系统的设计与实现(源码+LW+部署文档+全bao+远程调试+代码讲解等) 程序猿刘 vue spring boot 毕业设计 java 课程设计学习
博主介绍：✌️码农一枚，专注于大学生项目实战开发、讲解和毕业文撰写修改等。全栈领域优质创作者，博客之星、掘金/华为云/阿里云/InfoQ等平台优质作者、专注于Java、小程序技术领域和毕业项目实战✌️技术范围：：小程序、SpringBoot、SSM、JSP、Vue、PHP、Java、python、爬虫、数据可视化、大数据、物联网、机器学习等设计与开发。主要内容：免费开题报告、任务书、全bao定制+
java毕业设计源码案例-基于ssm+协同过滤的个性化小说推荐系统设计与实现(源码+LW+部署文档+全bao+远程调试+代码讲解等) 项目帮 springboot java 计算机毕设 java 课程设计开发语言
博主介绍：✌️码农一枚，专注于大学生项目实战开发、讲解和毕业文撰写修改等。全栈领域优质创作者，博客之星、掘金/华为云/阿里云/InfoQ等平台优质作者、专注于Java、小程序技术领域和毕业项目实战✌️技术范围：：小程序、SpringBoot、SSM、JSP、Vue、PHP、Java、python、爬虫、数据可视化、大数据、物联网、机器学习等设计与开发。主要内容：免费功能设计，开题报告、任务书、全b
AI 大模型重塑软件开发流程万花丛中一抹绿人工智能
一、AI大模型的定义与发展历史AI大模型是基于海量数据训练的深度学习模型，具备强大的自然语言理解、逻辑推理和知识生成能力。在软件开发领域，以GPT-4、CodeLlama、GitHubCopilotX为代表的大模型，能理解代码语法、语义及业务逻辑，实现代码生成、漏洞检测等复杂任务。其发展可追溯至2017年，谷歌提出Transformer架构，为大模型奠定了核心基础。2018年，GPT-1问世，参数
Javascript 异步编程（三）定时器夏末远歌
Javascript异步编程（三）并行？并发？异步？同步：synchronous:指所有任务按出现的先后顺序依次执行如果出现阻塞的任务，那么线程就会等待这个任务完成，接着执行下一个任务。异步：asynchronous:不保证所有任务按出现的顺序执行并发：concurrent:从宏观上，某个时间段里面多个程序都得到了运行，但不是说“同时运行”并行：parallel：在多核心下，因进程和线程独立运行，
PyTorch笔记6----------神经网络案例 HuashuiMu花水木 PyTorch笔记 pytorch 笔记
1.回归网络波士顿房价预测模型搭建波士顿房价数据集下载链接：百度网盘请输入提取码提取码:5279导入所需包importtorchimportnumpyasnpimportre读取数据ff=open('housing.data').readlines()data=[]foriteminff:out=re.sub(r"\s{2,}","",item).strip()#通过正则表达式去除所有空格data
springboot+vue生态系统的气象数据可视化平台Java+python-计算机毕业设计
目录功能和技术介绍具体实现截图开发核心技术：开发环境开发步骤编译运行核心代码部分展示系统设计详细视频演示可行性论证软件测试源码获取功能和技术介绍该系统基于浏览器的方式进行访问，采用springboot集成快速开发框架，前端使用vue方式，基于es5的语法，开发工具IntelliJIDEAx64，因为该开发工具，内嵌了Tomcat服务运行机制，可不用单独下载Tomcatserver服务器。由于考虑到
计算机专业大数据毕业设计-基于 Spark 的音乐数据分析项目(源码+LW+部署文档+全bao+远程调试+代码讲解等) 程序猿八哥数据可视化计算机毕设 spark 大数据课程设计 spark
博主介绍：✌️码农一枚，专注于大学生项目实战开发、讲解和毕业文撰写修改等。全栈领域优质创作者，博客之星、掘金/华为云/阿里云/InfoQ等平台优质作者、专注于Java、小程序技术领域和毕业项目实战✌️技术范围：：小程序、SpringBoot、SSM、JSP、Vue、PHP、Java、python、爬虫、数据可视化、大数据、物联网、机器学习等设计与开发。主要内容：免费功能设计，开题报告、任务书、全b
在 Conda 中删除环境及所有安装的库 Studying 开龙wu conda
注意事项1.删除环境前确保你没有在该环境中运行任何程序。2.删除操作是不可逆的，所有该环境中的包和配置都会被永久删除。3.如果你想保留环境的配置信息，可以在删除前使用condaenvexport>environment.yml导出环境配置。关于requirements.txt和environment.yaml文件使用介绍详情可参考以往文章，争对机器学习和深度学习里Python项目开发管理项目依赖的
Java程序设计笔记是程序蜂啊 java 笔记开发语言
Java程序设计目录Java程序设计第一章java语言开发环境1.1工具篇1.2Eclipse调整字体第三章Java基础3.1java基本数据类型3.2关键字与标识符3.3常数3.4变量3.5.数据类型转换3.6由键盘输入数据4.1顺序结构4.2分支语句5.1什么是数组5.2数组赋值：5.3一维数组5.4二维数组6.1类的基本概念6.2定义类6.3对象的创建与使用6.4参数的传递第七章java语言
Javascript 平行四边形周长计算程序(Program for Circumference of a Parallelogram)
给定平行四边形的边，计算周长。示例：输入：a=10，b=8输出：36.00输入：a=25.12，b=20.4输出：91.04平行四边形的对边长度相等且平行。两角相等，但不一定为90度。平行四边形的周长可以计算为两条相邻边之和，每条边乘以2。计算平行四边形周长的公式：（2*a）+（2*b）//JavascriptProgramtocalculatethe//CircumferenceofaParal
什么是Java？想学习却不知道从哪开始？不熬夜不是好程序员
谈起Java，相信有很多小伙伴们也跟我刚开始一样，对他的了解只有难，学成之后工资高，从入门学到入土，但当你真正开始系统的学习之后才发现其实哪些程序猿们也不过尔尔（刚学习完刚入职那种。。。）什么是Java?Java是一门编程语言，Java是一门掌握了技术就可以拿到高薪的工作岗位。Java这个语言在我国发展的很完善，相当于你掌握了Java技术出来，具备一定的开发经验，既可以在一线城市找到合适的岗位工作
绝佳组合 SpringBoot + Lua + Redis = 王炸！
Java精选面试题（微信小程序）：5000+道面试题和选择题，真实面经，简历模版，包含Java基础、并发、JVM、线程、MQ系列、Redis、Spring系列、Elasticsearch、Docker、K8s、Flink、Spark、架构设计、大厂真题等，在线随时刷题！前言曾经有一位魔术师，他擅长将SpringBoot和Redis这两个强大的工具结合成一种令人惊叹的组合。他的魔法武器是Redis的
Python基础（字符串的切片与断言）日暮凡尘 python 开发语言 pycharm
'''1.输入一个字符串，判断是否只包含英文字母（大写或小写）。输出True或False。2.输入一个字符串，统计里面数字字符（0-9）的数量。3.输入两个字符串，第一个是主串，第二个是要查找的字符，判断字符是否在主串中。4.输入一个字符串，将所有数字字符转换成整数后求和。5.统计字符串中空格的数量6.输入字符串和数字n，判断字符串是否只包含数字且长度等于n。7.验证用户输入的手机号格式（中国手机
聊聊flink的RpcService go4it
序本文主要研究一下flink的RpcServiceRpcServiceflink-release-1.7.2/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/RpcService.javapublicinterfaceRpcService{StringgetAddress();intgetPort();CompletableFutu
java--单元测试、内省
junit(单元测试框架)junit要注意的细节：1.如果使用junit测试一个方法的时候，在junit窗口上显示绿条那么代表测试正确，如果是出现了红条，则代表该方法测试出现了异常不通过。2.如果点击方法名、类名、包名、工程名运行junit分别测试的是对应的方法，类、包中的所有类的test方法，工程中的所有test方法。3.@Test测试的方法不能是static修饰与不能带有形参（可以写一个测试方
MySQL(149)如何进行数据清洗？辞暮尔尔-烟火年年 MySQL mysql python 数据库
数据清洗在数据处理和分析过程中至关重要，确保数据质量和一致性。以下是一个详细的指南，展示如何使用Java进行数据清洗，包括处理缺失值、重复值、异常值、数据类型转换以及标准化等步骤。一、准备工作确保安装有Java开发环境（JDK）和Maven或Gradle等依赖管理工具。我们将使用ApacheCommonsCSV库来处理CSV文件，并使用Java标准库进行数据清洗操作。二、加载数据首先，我们加载数据
（详细！！）2024最新Neo4j详细使用指南熊猫发电机：miniqq207 neo4j neo4j
Neo4j详细使用指南一、介绍Neo4j是什么Neo4j是一个高性能的,NOSQL图形数据库，它将结构化数据存储在网络上而不是表中。它是一个嵌入式的、基于磁盘的、具备完全的事务特性的Java持久化引擎，但是它将结构化数据存储在网络(从数学角度叫做图)上而不是表中。Neo4j也可以被看作是一个高性能的图引擎，该引擎具有成熟数据库的所有特性。程序员工作在一个面向对象的、灵活的网络结构下而不是严格、静态
（详细文档）java web在线商城系统（jsp + servlet）熊猫发电机：miniqq207 实训项目数据仓库大数据
目录一、设计任务......................................................................................41.1设计意义................................................................................41.2设计目的..........
mysql事物详解
前言：事物是什么？作为一个java程序员，也许我们仅仅只是停留在会使用的程度上，会通过在类上或者方法上使用@Transactional注解的方式来使用事物，但是背后的原理，为什么使用这个注解就能使事物生效可能并不是很清楚。下面本文详细一一介绍事物是什么，事物的特性，怎么使用等等。1.事物是什么所谓事物，在我的理解中就是一系列操作的一个集合，一旦其中一个操作失败，那么整个操作集合必须全部失败，回滚到
JAVAWeb2 DanB24 oracle 数据库
1.数据库设计1.软件的研发步骤数据库设计概念数据库设计就是根据业务系统的具体需求，结合我们所选用的DBMS，为这个业务系统构造出最优的数据存储模型。建立数据库中的表结构以及表与表之间的关联关系的过程。有哪些表？表里有哪些字段？表和表之间有什么关系？数据库设计的步骤需求分析（数据是什么?数据具有哪些属性?数据与属性的特点是什么）逻辑分析（通过ER图对数据库进行逻辑建模，不需要考虑我们所选用的数据库
java Illegal overloaded getter method with ambiguous type for propert的解决 zwllxs java jdk
好久不来iteye,今天又来看看，哈哈,今天碰到在编码时，反射中会抛出 Illegal overloaded getter method with ambiguous type for propert这么个东东，从字面意思看，是反射在获取getter时迷惑了，然后回想起java在boolean值在生成getter时，分别有is和getter，也许我们的反射对象中就有is开头的方法迷惑了jdk，
IT人应当知道的10个行业小内幕 beijingjava 工作互联网
10. 虽然IT业的薪酬比其他很多行业要好，但有公司因此视你为其“佣人”。　　尽管IT人士的薪水没有互联网泡沫之前要好，但和其他行业人士比较，IT人的薪资还算好点。在接下的几十年中，科技在商业和社会发展中所占分量会一直增加，所以我们完全有理由相信，IT专业人才的需求量也不会减少。　　然而，正因为IT人士的薪水普遍较高，所以有些公司认为给了你这么多钱，就把你看成是公司的“佣人”，拥有你的支配
java 实现自定义链表 CrazyMizzz java 数据结构
1.链表结构链表是链式的结构 2.链表的组成链表是由头节点，中间节点和尾节点组成节点是由两个部分组成： 1.数据域 2.引用域 3.链表的实现 &nbs
web项目发布到服务器后图片过一会儿消失麦田的设计者 struts2 上传图片永久保存
作为一名学习了android和j2ee的程序员，我们必须要意识到，客服端和服务器端的交互是很有必要的，比如你用eclipse写了一个web工程，并且发布到了服务器（tomcat）上，这时你在webapps目录下看到了你发布的web工程，你可以打开电脑的浏览器输入http://localhost:8080/工程/路径访问里面的资源。但是，有时你会突然的发现之前用struts2上传的图片
CodeIgniter框架Cart类 name 不能设置中文的解决方法 IT独行者 CodeIgniter Cart 框架　
今天试用了一下CodeIgniter的Cart类时遇到了个小问题，发现当name的值为中文时，就写入不了session。在这里特别提醒一下。在CI手册里也有说明，如下： $data = array( 'id' => 'sku_123ABC', 'qty' => 1, '
linux回收站 _wy_ linux 回收站
今天一不小心在ubuntu下把一个文件移动到了回收站，我并不想删，手误了。我急忙到Nautilus下的回收站中准备恢复它，但是里面居然什么都没有。后来我发现这是由于我删文件的地方不在HOME所在的分区，而是在另一个独立的Linux分区下，这是我专门用于开发的分区。而我删除的东东在分区根目录下的.Trash-1000/file目录下，相关的删除信息（删除时间和文件所在
jquery回到页面顶端知了ing html jquery css
html代码： <h1 id="anchor">页面标题</h1> <div id="container">页面内容</div> <p><a href="#anchor" class="topLink">回到顶端</a><
B树、B-树、B+树、B*树矮蛋蛋 B树
原文地址： http://www.cnblogs.com/oldhorse/archive/2009/11/16/1604009.html B树即二叉搜索树： 1.所有非叶子结点至多拥有两个儿子（Left和Right）； &nb
数据库连接池 alafqq 数据库连接池
http://www.cnblogs.com/xdp-gacl/p/4002804.html @Anthor:孤傲苍狼数据库连接池用MySQLv5版本的数据库驱动没有问题，使用MySQLv6和Oracle的数据库驱动时候报如下错误： java.lang.ClassCastException: $Proxy0 cannot be cast to java.sql.Connec
java泛型百合不是茶 java泛型
泛型在Java SE 1.5之前，没有泛型的情况的下，通过对类型Object的引用来实现参数的“任意化”，任意化的缺点就是要实行强制转换，这种强制转换可能会带来不安全的隐患泛型的特点：消除强制转换确保类型安全向后兼容简单泛型的定义：泛型：就是在类中将其模糊化，在创建对象的时候再具体定义 class fan
javascript闭包[两个小测试例子] bijian1013 JavaScript JavaScript
一.程序一 <script> var name = "The Window"; var Object_a = { 　　name : "My Object", 　　getNameFunc : function(){ var that = this; 　　　　return function(){ 　　　　
探索JUnit4扩展：假设机制（Assumption） bijian1013 java Assumption JUnit 单元测试
一.假设机制（Assumption）概述理想情况下，写测试用例的开发人员可以明确的知道所有导致他们所写的测试用例不通过的地方，但是有的时候，这些导致测试用例不通过的地方并不是很容易的被发现，可能隐藏得很深，从而导致开发人员在写测试用例时很难预测到这些因素，而且往往这些因素并不是开发人员当初设计测试用例时真正目的，
【Gson四】范型POJO的反序列化 bit1129 POJO
在下面这个例子中，POJO(Data类)是一个范型类，在Tests中，指定范型类为PieceData，POJO初始化完成后，通过 String str = new Gson().toJson(data); 得到范型化的POJO序列化得到的JSON串，然后将这个JSON串反序列化为POJO import com.google.gson.Gson; import java.
【Spark八十五】Spark Streaming分析结果落地到MySQL bit1129 Stream
几点总结： 1. DStream.foreachRDD是一个Output Operation，类似于RDD的action，会触发Job的提交。DStream.foreachRDD是数据落地很常用的方法 2. 获取MySQL Connection的操作应该放在foreachRDD的参数（是一个RDD[T]=>Unit的函数类型)，这样，当foreachRDD方法在每个Worker上执行时，
NGINX + LUA实现复杂的控制 ronin47 nginx lua
安装lua_nginx_module 模块 lua_nginx_module 可以一步步的安装，也可以直接用淘宝的OpenResty Centos和debian的安装就简单了。。这里说下freebsd的安装： fetch http://www.lua.org/ftp/lua-5.1.4.tar.gz tar zxvf lua-5.1.4.tar.gz cd lua-5.1.4 ma
java-递归判断数组是否升序 bylijinnan java
public class IsAccendListRecursive { /*递归判断数组是否升序 * if a Integer array is ascending,return true * use recursion */ public static void main(String[] args){ IsAccendListRecursiv
Netty源码学习-DefaultChannelPipeline2 bylijinnan java netty
Netty3的API http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/ChannelPipeline.html 里面提到ChannelPipeline的一个“pitfall”：如果ChannelPipeline只有一个handler（假设为handlerA）且希望用另一handler（假设为handlerB）来
Java工具之JPS chinrui java
JPS使用熟悉Linux的朋友们都知道，Linux下有一个常用的命令叫做ps（Process Status)，是用来查看Linux环境下进程信息的。同样的，在Java Virtual Machine里面也提供了类似的工具供广大Java开发人员使用，它就是jps（Java Process Status)，它可以用来
window.print分页打印 ctrain window
function init() { var tt = document.getElementById("tt"); var childNodes = tt.childNodes[0].childNodes; var level = 0; for (var i = 0; i < childNodes.length; i++) {
安装hadoop时执行jps命令Error occurred during initialization of VM daizj jdk hadoop jps
在安装hadoop时，执行JPS出现下面错误 [slave16][email protected]:/tmp/hsperfdata_hdfs# jps Error occurred during initialization of VM java.lang.Error: Properties init: Could not determine current working
PHP开发大型项目的一点经验 dcj3sjt126com PHP 重构
一、变量最好是把所有的变量存储在一个数组中，这样在程序的开发中可以带来很多的方便，特别是当程序很大的时候。变量的命名就当适合自己的习惯，不管是用拼音还是英语，至少应当有一定的意义，以便适合记忆。变量的命名尽量规范化，不要与PHP中的关键字相冲突。二、函数 PHP自带了很多函数，这给我们程序的编写带来了很多的方便。当然，在大型程序中我们往往自己要定义许多个函数，几十
android笔记之--向网络发送GET/POST请求参数 dcj3sjt126com android
使用GET方法发送请求 private static boolean sendGETRequest (String path, Map<String, String> params) throws Exception{ //发送地http://192.168.100.91:8080/videoServi
linux复习笔记之bash shell (3) 通配符 eksliang linux 通配符 linux通配符
转载请出自出处： http://eksliang.iteye.com/blog/2104387 在bash的操作环境中有一个非常有用的功能，那就是通配符。下面列出一些常用的通配符，如下表所示符号意义 * 万用字符，代表0个到无穷个任意字符 ? 万用字符，代表一定有一个任意字符 [] 代表一定有一个在中括号内的字符。例如：[abcd]代表一定有一个字符，可能是a、b、c
Android关于短信加密 gqdy365 android
关于Android短信加密功能，我初步了解的如下（只在Android应用层试验）： 1、因为Android有短信收发接口，可以调用接口完成短信收发；发送过程：APP（基于短信应用修改）接受用户输入号码、内容——>APP对短信内容加密——>调用短信发送方法Sm
asp.net在网站根目录下创建文件夹 hvt .net C#hovertree asp.net Web Forms
假设要在asp.net网站的根目录下建立文件夹hovertree,C#代码如下： string m_keleyiFolderName = Server.MapPath("/hovertree"); if (Directory.Exists(m_keleyiFolderName)) { //文件夹已经存在 return; } else { try { D
一个合格的程序员应该读过哪些书 justjavac 程序员书籍
编者按：2008年8月4日，StackOverflow 网友 Bert F 发帖提问：哪本最具影响力的书，是每个程序员都应该读的？ “如果能时光倒流，回到过去，作为一个开发人员，你可以告诉自己在职业生涯初期应该读一本，你会选择哪本书呢？我希望这个书单列表内容丰富，可以涵盖很多东西。” 很多程序员响应，他们在推荐时也写下自己的评语。以前就有国内网友介绍这个程序员书单，不过都是推荐数
单实例实践跑龙套_az 单例
1、内部类 public class Singleton { private static class SingletonHolder { public static Singleton singleton = new Singleton(); } public Singleton getRes
PO VO BEAN 理解 q137681467 VO DTO po
PO：全称是 persistant object持久对象最形象的理解就是一个PO就是数据库中的一条记录。好处是可以把一条记录作为一个对象处理，可以方便的转为其它对象。 BO：全称是 business object:业务对象主要作用是把业务逻辑封装为一个对象。这个对
战胜惰性，暗自努力金笛子努力
偶然看到一句很贴近生活的话：“别人都在你看不到的地方暗自努力，在你看得到的地方，他们也和你一样显得吊儿郎当，和你一样会抱怨，而只有你自己相信这些都是真的，最后也只有你一人继续不思进取。”很多句子总在不经意中就会戳中一部分人的软肋，我想我们每个人的周围总是有那么些表现得“吊儿郎当”的存在，是否你就真的相信他们如此不思进取，而开始放松了对自己的要求随波逐流呢？我有个朋友是搞技术的，平时嘻嘻哈哈，以
NDK/JNI二维数组多维数组传递 wenzongliang 二维数组 jni NDK
多维数组和对象数组一样处理，例如二维数组里的每个元素还是一个数组用jArray表示，直到数组变为一维的，且里面元素为基本类型，去获得一维数组指针。给大家提供个例子。已经测试通过。 Java_cn_wzl_FiveChessView_checkWin( JNIEnv* env,jobject thiz,jobjectArray qizidata) { jint i,j; int s

第05章 深度卷积神经网络模型

序言