行走的算法

Notes on Building Neural Networks with PyTorch (Part 2)

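3. Neural Networks and Deep Learning
- 3.1 Origins of the Fashion-MNIST Dataset
  - 3.1.1 The Fashion-MNIST dataset
- 3.2 Importing and Loading the Dataset with torchvision
  - 3.2.1 Workflow of a deep learning project
  - 3.2.2 Data preparation follows the ETL process
  - 3.2.3 Preparing the data
- 3.3 Accessing the Dataset
  - 3.3.1 Imbalanced datasets
- 3.4 Building the Network
  - 3.4.1 Distinguishing class and object
  - 3.4.2 Classes and instances (objects)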

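3. Neural Networks and Deep Learning

3.1 Origins of the Fashion-MNIST Dataset

A computer program generally has two main parts: code and data
In deep learning, the "software" is the network itself, and in particular the weights produced by training
A neural network programmer's job is to supervise and guide the learning process through training (which can be seen as an indirect way of writing software or code)

3.1.1 The Fashion-MNIST dataset

MNIST is a very well-known dataset of handwritten digits (M: Modified; NIST: National Institute of Standards and Technology)
MNIST contains 70,000 images: 60,000 for training and 10,000 for testing, in ten classes (the digits 0 to 9)
Fashion-MNIST comes from Zalando: its 10 classes correspond to 10 kinds of clothing, and it contains 70,000 grayscale images of size 28x28
Fashion-MNIST is intended to replace MNIST as a benchmark for testing machine learning algorithms
Differences and similarities between Fashion-MNIST and MNIST: (1) Difference: the MNIST images are handwritten digits, whereas the Fashion-MNIST images show real products; (2) Similarities: the two datasets have the same size, the same image dimensions, the same data format, and the same split into training and test sets
Why MNIST is so popular: 1. its size lets deep learning researchers check and reproduce their algorithms quickly; 2. it is available in every deep learning framework
The torchvision package for PyTorch can load the Fashion-MNIST dataset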

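3.2 Importing and Loading the Dataset with torchvision

3.2.1 Workflow of a deep learning project:

Prepare the dataset
Build the network model
Train the network model
Analyze the results

3.2.2 Data preparation follows the ETL process:

Extract, transform, load
PyTorch ships with packages that make the ETL process simple

3.2.3 Preparing the data:

1. Extract: get the Fashion-MNIST image data from the source
2. Transform: convert the data into tensor form
3. Load: wrap the data in an object so that it is easy to access
In torchvision, the biggest difference between loading Fashion-MNIST and loading MNIST is the download URL
torch.utils.data.Dataset: an abstract class for representing a dataset
torch.utils.data.DataLoader: wraps a dataset and provides access to the underlying data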

import torch
import torchvision
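import torchvision.transforms as transforms  # helpers for transforming the data

train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',   # where the dataset is stored locally
    train = True,                   # use the training split
    download = True,                # download the data if it is not already present
    transform = transforms.Compose([
        transforms.ToTensor()
    ])                              # convert the images to tensors
)

train_loader = torch.utils.data.DataLoader(train_set)
# The training set is wrapped by the data loader, which lets us access the underlying data in the format we want;
# the data loader gives us access to the data and provides querying capabilities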

Warning raised (a UserWarning, not an error):

D:\Anaconda3_install\envs\pytorch_1.9\lib\site-packages\torchvision\datasets\mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  ..\torch\csrc\utils\tensor_numpy.cpp:180.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)

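Fix: edit the mnist.py file
Pytorch | 报错The given NumPy array is not writeable,and PyTorch does not support non-writeable tensor
The linked article above resolved the warning for me

My fix:
Clicking the warning jumps straight to mnist.py, at the position of copy=False; deleting copy=False there makes the warning go away.

Bach_size大小的调整
This article explains what the batch size means.
神经网络中batch_size参数的含义及设置方法
This article covers some techniques for choosing the batch size, with their advantages and disadvantages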
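
To make the effect of batch_size concrete, here is a minimal sketch that is not taken from the course. It assumes a small synthetic TensorDataset of 60 zero-filled "images" as a stand-in for Fashion-MNIST, so nothing has to be downloaded, and shows how batch_size changes the number of batches per epoch and the shape of each batch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the real dataset (an assumption for illustration):
# 60 fake grayscale images of shape [1, 28, 28] with dummy labels
images = torch.zeros(60, 1, 28, 28)
labels = torch.zeros(60, dtype=torch.long)
dataset = TensorDataset(images, labels)

batches_per_epoch = {}
for batch_size in (1, 10, 25):
    loader = DataLoader(dataset, batch_size=batch_size)
    batches_per_epoch[batch_size] = len(loader)       # number of batches in one epoch
    first_images, first_labels = next(iter(loader))   # the first batch
    print(batch_size, len(loader), tuple(first_images.shape))

print(batches_per_epoch)
```

With 60 samples, a batch size of 25 gives three batches, the last of which holds only the remaining 10 samples.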

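3.3 Accessing the Dataset

import torch
import torchvision
import torchvision.transforms as transforms  # helpers for transforming the data
import numpy as np
import matplotlib.pyplot as plt

train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',   # where the dataset is stored locally
    train = True,                   # use the training split
    download = True,                # download the data if it is not already present
    transform = transforms.Compose([
        transforms.ToTensor()
    ])                              # convert the images to tensors
)

num_workers = 4  # intended number of worker processes (not passed to the DataLoader below)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=10)
# The training set is wrapped by the data loader, which lets us access the underlying data in the format we want;
# the data loader gives us access to the data and provides querying capabilities

torch.set_printoptions(linewidth=120)     # set the line width for printed output
print(len(train_set))
print(train_set.targets)                  # train_labels is the deprecated name for targets
print(train_set.targets.bincount())       # bincount: how often each value occurs in the tensor

out: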
60000
tensor([9, 0, 0,  ..., 3, 0, 5])
tensor([6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000, 6000])

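# Inspect a single sample
sample = next(iter(train_set))
print(len(sample))
print(type(sample))
image, label = sample    # each sample is an (image, label) tuple
print(image.shape)

out:
2
<class 'tuple'>
torch.Size([1, 28, 28])

# Show the image and its label
plt.imshow(image.squeeze(), cmap='gray')    # [1, 28, 28] -> [28, 28]
plt.show()
print('label:', label)
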
# Inspect a batch of samples
batch= next(iter(train_loader))
print(len(batch))
print(type(batch))
images, labels = batch
print(images.shape)
print(labels.shape)

out:
2
<class 'list'>
torch.Size([10, 1, 28, 28])
torch.Size([10])

# Plot a batch of images
grid= torchvision.utils.make_grid(images,nrow =10)
print(grid.shape)
plt.figure(figsize=(15, 15))
plt.imshow(np.transpose(grid,(1,2,0)))   # reorder the axes from (C, H, W) to (H, W, C) for imshow
print('labels:', labels)
# Changing the batch size shows more (or fewer) images

out:
torch.Size([3, 32, 302])
labels: tensor([9, 0, 0, 3, 0, 2, 7, 2, 5, 5])

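Where:

grid = torchvision.utils.make_grid(images, nrow=10)
# make_grid tiles several images into a single image. images is the batch of images (defined earlier); nrow is the number of images per row (number of rows = number of images / nrow, rounded up); padding is the width of the gap between neighbouring sub-images.

For example:

# Plot a batch of images
grid = torchvision.utils.make_grid(images, nrow=3)
print(grid.shape)
plt.figure(figsize=(15, 15))
plt.imshow(np.transpose(grid,(1,2,0)))   # reorder the axes from (C, H, W) to (H, W, C) for imshow
plt.show()
print('labels:', labels)
# Changing the batch size shows more (or fewer) images

numpy.transpose permutes the axes of an array
For example, think of the axes (0, 1, 2) as (x, y, z): np.transpose(grid, (1, 2, 0)) reorders them to (y, z, x). For the grid above, that turns (channels, height, width) into the (height, width, channels) layout that plt.imshow expects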
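
The axis permutation can be checked on a small array. This is only an illustrative sketch: the array below is a made-up stand-in with the same shape as the grid printed earlier, not the real image grid.

```python
import numpy as np

# Stand-in for the image grid (assumed shape): (channels, height, width)
grid = np.arange(3 * 32 * 302).reshape(3, 32, 302)

# Move axis 0 (channels) to the end: (height, width, channels)
image = np.transpose(grid, (1, 2, 0))
print(grid.shape, image.shape)

# The data is unchanged, only the index order differs:
# image[h, w, c] is the same element as grid[c, h, w]
same_element = image[5, 7, 2] == grid[2, 5, 7]
print(same_element)
```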

Python numpy.transpose 详解
http://www.360doc.com/content/19/0602/00/7669533_839717717.shtml

3.3.1 Imbalanced datasets

For more on class imbalance, see the paper: A systematic study of the class imbalance problem in convolutional neural networks
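
As a small sketch of how class balance can be checked, the bincount call used above on the Fashion-MNIST labels (where every class has 6000 samples) can be applied to any label tensor. The labels below are invented for illustration, and the max/min ratio is just one simple way to summarize imbalance, not a measure taken from the paper.

```python
import torch

# Hypothetical, deliberately imbalanced labels: four 0s, two 1s, one 2
labels = torch.tensor([0, 0, 0, 0, 1, 1, 2])

counts = labels.bincount()          # number of samples per class
imbalance_ratio = counts.max().item() / counts.min().item()
print(counts)
print(imbalance_ratio)
```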

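3.4 Building the Network

3.4.1 Distinguishing class and object

A class is a blueprint or description of an actual object
An object is the thing itself
Objects are created as instances of a class
Every instance of a given class has two core components: methods and attributes
Methods represent code and attributes represent data; both are defined by the class
Attributes describe an object's characteristics; methods describe its behaviour, i.e. what the object can do
A project can contain many objects, and multiple instances of a given class can exist at the same time (many objects can be created from one class)
A class encapsulates methods and attributes

3.4.2 Classes and instances (objects)

A class is an abstract template describing a collection of objects that share the same attributes and methods; class names should be self-explanatory
An object is concrete, something that can be seen and touched
Defining a class: class ClassName():
Parts of a class: the class name; attributes (a set of data); methods (the operations allowed)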
# Defining the class
class Lizard:
    def __init__(self, name):    # runs automatically when an object is created; no explicit call needed, returns nothing
        self.name = name
    def set_name(self, name):
        self.name = name
# Using the class
lizard = Lizard('deep')
print(lizard.name)
lizard.set_name('lizard')
print(lizard.name)

out:
deep
lizard

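3.4.3 Combining object-oriented programming with PyTorch

The main building blocks of a neural network are layers (PyTorch's neural network library contains classes that help construct layers)
Each layer in a neural network has two main components: a transformation and weights (the transformation represents code; the weights represent data)
The forward method (forward pass): the tensor flows forward through each layer's transformation until it reaches the output layer
A forward method must be provided when building a neural network; the forward method is the actual transformation
Steps for creating a neural network with PyTorch:
1. Extend the nn.Module base class
2. Define the layers as class attributes
3. Implement the forward method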

# Building the CNN
import torch.nn as nn
class Network(nn.Module):   # putting nn.Module in the parentheses makes Network inherit all the functionality of the Module base class
    def __init__(self):
        super(Network, self).__init__()     # initialize the inherited parent-class attributes using the parent's __init__
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)        # the tensor must be flattened before passing from a conv layer to a linear layer
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t
network = Network()     # create the network object
print(network)



out:
Network(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=192, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)

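3.5 Building the CNN and using the network's parameters

In the Network class above we defined two convolutional layers and three linear layers. Each layer encapsulates two things: the definition of a forward function and a weight tensor. The weight tensor of each layer holds the values that are updated as the network learns during training, which is why layers are defined as attributes of the network class. The Module class lets PyTorch keep track of each layer's weight tensor, and because Network extends Module, it inherits this ability automatically.

Parameter vs. argument:
A parameter is used in a function definition and can be thought of as a placeholder (formal parameter)
An argument is the actual value passed to the function when it is called (actual parameter)
Two kinds of parameters:
1. Hyperparameters: values chosen manually and arbitrarily; when building a neural network, kernel_size, out_channels and out_features all have to be chosen by hand

2. Data dependent hyperparameters: parameters whose values depend on the data

These parameters sit at the start or the end of the network: the input channels of the first convolutional layer and the output features of the last linear layer

The input channels of the first convolutional layer depend on the number of colour channels in the training images (1 for grayscale images, 3 for colour images)
The output features of the output layer depend on the number of classes in the training set (Fashion-MNIST has 10 classes, so the output layer has out_features=10)
In general, a layer's input is the previous layer's output (i.e. the input channels of every convolutional layer and the input features of every linear layer depend on the data coming from the previous layer)

When a tensor is passed from a convolutional layer to a linear layer, it must be flattened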
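
A minimal sketch of that flattening step, assuming a random tensor with the shape the second convolution block would produce for a batch of 10 images (12 channels of 4x4). It compares the reshape call used later in these notes with Tensor.flatten, an equivalent alternative that does not appear in the original.

```python
import torch

# Assumed stand-in for the conv output: [batch, channels, height, width]
t = torch.rand(10, 12, 4, 4)

flat_a = t.reshape(-1, 12 * 4 * 4)   # the form used in the notes
flat_b = t.flatten(start_dim=1)      # keeps the batch axis, flattens the rest

print(flat_a.shape)
print(torch.equal(flat_a, flat_b))
```

Each image in the batch becomes one row of 192 values, which matches in_features=12*4*4 of fc1.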

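3.6 CNN weights

Learnable parameters are parameters learned during training; they start at arbitrary values and are updated iteratively as the network learns
Saying that the network is learning means that it is learning suitable values for its parameters, i.e. values that minimize the loss function
The learnable parameters are the network's weights, and they live inside each layer
When we extend a class we get all of its functionality; on top of that we can add new functionality or override existing functionality, for example: def __repr__(self):
In Python, special object-oriented methods usually have double leading and trailing underscores (__init__, __repr__)

# Here Network does not extend the Module base class: class Network() is missing nn.Module, and super(Network, self).__init__() is commented out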
import torch.nn as nn
class Network():
    def __init__(self):
        #super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self,t):
        # implement the forward pass
        return t

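network = Network()     # create the network object
print(network)


out:
# This is Python's default string representation
<__main__.Network at 0x22587c614a8>

As shown below, when Module is not extended, the printed output can still be customized by defining __repr__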

import torch.nn as nn
class Network():
    def __init__(self):
        #super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self,t):
        # implement the forward pass
        return t
        
    # Override Python's default string representation
    def __repr__(self):
        return "lizard"
network = Network()     # create the network object
print(network)


out:
lizard

The code used in the video to print the network is:

import torch.nn as nn
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self,t):
        # implement the forward pass
        return t

network = Network()     # create the network object
print(network)

out:
Network(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=192, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)

Dot notation gives access to a specific layer

print(network.conv1)

out:
Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))

# Print conv1's weights
print(network.conv1.weight)

out:
Parameter containing:
tensor([[[[-0.1438, -0.1988,  0.1899, -0.1422,  0.1970],
          [ 0.1218,  0.1801,  0.0804,  0.1110, -0.1473],
          [-0.1049, -0.1533,  0.0420,  0.1099, -0.1373],
          [ 0.1582, -0.0019, -0.0629,  0.0914, -0.0435],
          [-0.1514, -0.0354, -0.1848, -0.0231,  0.1370]]],


        [[[ 0.0317, -0.1364,  0.1620,  0.1353, -0.1444],
          [ 0.0680,  0.1570,  0.0125,  0.0637, -0.0675],
          [-0.1313, -0.1136,  0.1897,  0.1206, -0.0622],
          [-0.1080, -0.0497, -0.0702, -0.0526, -0.1793],
          [ 0.0029,  0.1846, -0.0085,  0.0482, -0.0998]]],


        [[[-0.0316,  0.0776, -0.0835,  0.1112,  0.0020],
          [-0.0056, -0.1553, -0.1064,  0.1666,  0.1231],
          [ 0.1483,  0.1326,  0.0449,  0.0727, -0.0959],
          [ 0.1752, -0.1934,  0.0086,  0.1932, -0.0894],
          [ 0.0845,  0.0121, -0.1207,  0.0316, -0.1766]]],


        [[[ 0.0294,  0.1874, -0.1835,  0.0130, -0.0245],
          [-0.0159, -0.1468, -0.0155, -0.0169, -0.0171],
          [-0.1077, -0.1065, -0.1337, -0.1069, -0.1904],
          [-0.1552, -0.1737, -0.0083,  0.1185,  0.0473],
          [ 0.0124,  0.0715, -0.1177, -0.0071, -0.0533]]],


        [[[ 0.0202,  0.0005, -0.1567, -0.0514, -0.1844],
          [ 0.1773,  0.0434, -0.0500, -0.0931, -0.0610],
          [-0.0461,  0.0202, -0.1609,  0.1488, -0.1418],
          [ 0.1540,  0.0594,  0.0386, -0.0253,  0.1520],
          [ 0.1568,  0.0054,  0.0918,  0.0434, -0.0474]]],


        [[[-0.0508,  0.1441, -0.0893, -0.1571,  0.1605],
          [-0.0918, -0.0100,  0.0122, -0.1781, -0.0800],
          [-0.1800, -0.0535,  0.0338, -0.1285,  0.0770],
          [ 0.0650,  0.1575,  0.1226,  0.1950, -0.0195],
          [-0.1236, -0.0997,  0.0097, -0.0187, -0.1009]]]], requires_grad=True)

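# Print the shape of conv1's weights
print(network.conv1.weight.shape)

out:
# The first axis (6) is the number of filters, the second (1) is the number of input channels, and the third and fourth are the filter height and width
torch.Size([6, 1, 5, 5])

Any single filter can be pulled out by indexing into the first axis of the weight tensor

print(network.conv1.weight[0].shape)

out:
# Depth is 1; height and width are 5
torch.Size([1, 5, 5])

# A fully connected layer takes flattened input, so its weight tensor is rank 2, with a height axis and a width axis
print(network.fc1.weight.shape)   # height=>out_features; width=>in_features

out:
#  self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
#  (fc1): Linear(in_features=192, out_features=120, bias=True)
# The output below shows the pattern: the height equals the number of output features, and the width equals the number of input features
torch.Size([120, 192])

This follows from how matrix multiplication works: a [120, 192] weight matrix multiplied by a 192-element input vector yields a 120-element output vector.

To keep track of all the weight tensors in a network, PyTorch has a class called Parameter, which extends the Tensor class; each layer's weight tensor is an instance of this class
The weight matrix defines a linear function (a linear map)

# Tensor multiplication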
in_features = torch.tensor([1,2,3,4],dtype=torch.float32)
weight_matrix = torch.tensor([
    [1,2,3,4],
    [2,3,4,5],
    [3,4,5,6]
], dtype=torch.float32)
print(weight_matrix.matmul(in_features))     # matmul: matrix multiply


out:
tensor([30., 40., 50.])

# Building the CNN
import torch.nn as nn
class Network(nn.Module):   # putting nn.Module in the parentheses makes Network inherit all the functionality of the Module base class
    def __init__(self):
        super(Network, self).__init__()     # initialize the inherited parent-class attributes using the parent's __init__
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)        # the tensor must be flattened before passing from a conv layer to a linear layer
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        # implement the forward pass
        return t

network = Network()     # create the network object
print(network)

out:
Network(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=192, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=60, bias=True)
  (out): Linear(in_features=60, out_features=10, bias=True)
)

Accessing all the parameters

# Access all the parameters
# Method 1:
for param in network.parameters():
    print(param.shape)

out:
torch.Size([6, 1, 5, 5])
torch.Size([6])
torch.Size([12, 6, 5, 5])
torch.Size([12])
torch.Size([120, 192])
torch.Size([120])
torch.Size([60, 120])
torch.Size([60])
torch.Size([10, 60])
torch.Size([10])

# Method 2:
for name, param in network.named_parameters():
    print(name,'\t\t', param.shape)

out:
conv1.weight 		 torch.Size([6, 1, 5, 5])
conv1.bias 		 torch.Size([6])
conv2.weight 		 torch.Size([12, 6, 5, 5])
conv2.bias 		 torch.Size([12])
fc1.weight 		 torch.Size([120, 192])
fc1.bias 		 torch.Size([120])
fc2.weight 		 torch.Size([60, 120])
fc2.bias 		 torch.Size([60])
out.weight 		 torch.Size([10, 60])
out.bias 		 torch.Size([10])

3.7 Callable modules in PyTorch

3.7.1 How Linear works

# 1. Tensor multiplication
in_features = torch.tensor([1,2,3,4], dtype=torch.float32)
weight_matrix = torch.tensor([
    [1,2,3,4],
    [2,3,4,5],
    [3,4,5,6]
], dtype = torch.float32)
print(weight_matrix.matmul(in_features))
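# The weight matrix above can be viewed as a linear map (a function); a PyTorch linear layer works the same way


out:
tensor([30., 40., 50.])

# 2. Linear layer
fc = nn.Linear(in_features=4, out_features=3)
# Passing 4 and 3 to the constructor makes the PyTorch linear layer create a 3x4 weight matrix
# Pass the in_features tensor through the layer
print(fc(in_features))
# The result differs from the one above because this layer's weight matrix is initialized with random values


out:
tensor([-0.4276, -1.8520,  2.3740], grad_fn=<...>)

# Wrap the weight matrix in a Parameter so that the output matches the result in 1
fc = nn.Linear(in_features=4, out_features=3)
fc.weight = nn.Parameter(weight_matrix)
print(fc(in_features))
# The result is now close to the one in 1 but not exact, because of the bias


out:
tensor([30.4195, 40.2070, 50.1337], grad_fn=<...>)

# Pass bias=False to get the exact output
fc = nn.Linear(in_features=4, out_features=3, bias=False)
fc.weight = nn.Parameter(weight_matrix)
print(fc(in_features))


out:
tensor([30., 40., 50.], grad_fn=<...>)

The linear transformation in mathematical form:
y = Ax + b
A: weight matrix tensor
x: input tensor
b: bias tensor
y: output tensor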
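
The formula can be checked against a layer directly. The sketch below is an illustration rather than part of the course code: it builds a randomly initialized nn.Linear and compares the layer's output with weight.matmul(x) + bias computed by hand.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                       # only to make the run repeatable
fc = nn.Linear(in_features=4, out_features=3)
x = torch.tensor([1., 2., 3., 4.])

with torch.no_grad():
    y_layer = fc(x)                              # what the layer computes
    y_manual = fc.weight.matmul(x) + fc.bias     # y = Ax + b by hand

print(fc.weight.shape, fc.bias.shape)
print(torch.allclose(y_layer, y_manual))
```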

3.7.2 The special call

This part of the course explains the internals of how a module is called; I skipped it
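
As a rough sketch of what that section is about (my own illustration, not the course's code): calling a module like a function goes through nn.Module.__call__, which runs forward plus any registered hooks, whereas calling forward directly skips the hooks. The hook below is a throwaway lambda introduced only to make the difference visible.

```python
import torch
import torch.nn as nn

fc = nn.Linear(in_features=4, out_features=3)

calls = []   # records the output shape every time the hook fires
handle = fc.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)

x = torch.ones(4)
out_call = fc(x)              # goes through __call__, so the hook runs
out_forward = fc.forward(x)   # bypasses __call__, so the hook does not run
handle.remove()

print(calls)
print(torch.equal(out_call, out_forward))
```

Both calls return the same values; only the first one triggers the hook, which is why modules are normally invoked as network(x) rather than network.forward(x).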

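3.8 Implementing the CNN forward method

The forward method uses all the layers defined in the constructor
The forward method is in effect the mapping from an input tensor to a predicted output tensor

3.8.1 Input Layer

The input layer is determined by the input data
The input layer can be seen as the identity transformation f(x)=x
The input layer usually exists implicitly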

import torch.nn as nn
import torch.nn.functional as F
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120,out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self,t):
        # (1) input layer
        t = t

        # (2) hidden conv layer1
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # (3) hidden conv layer2
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # relu and max pooling have no weights; activation and pooling are operations rather than layers: a layer has weights, an operation does not

        # (4) hidden linear layer1
        t = t.reshape(-1, 12*4*4)
        t = self.fc1(t)
        t = F.relu(t)

        # (5) hidden linear layer2
        t = self.fc2(t)
        t = F.relu(t)

        # (6) output layer
        t = self.out(t)
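        # t = F.softmax(t, dim=1)  # softmax is not applied here: the cross-entropy loss used in training applies softmax implicitly
        # In hidden layers, relu is the usual non-linear activation function
        # In the output layer, softmax is used when there are classes to predict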

        return t

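3.9 Predicting on a single image

3.9.1 Forward propagation

Forward propagation is the process of transforming an input tensor into an output tensor (a neural network is a function that maps input tensors to output tensors)
"Forward propagation" is just a special name for passing an input tensor to the network and receiving the network's output

3.9.2 Back propagation

Back propagation usually happens after forward propagation
torch.set_grad_enabled(False) turns off PyTorch's gradient computation, which stops PyTorch from building a computation graph as our tensor flows through the network (we turn it off because we are not training yet, only looking at a randomly initialized network)
The computation graph tracks the network's mapping by recording every computation applied to the tensor as it propagates through the network; during training the graph is used to compute derivatives, i.e. the gradient of the loss function. Turning it off is not mandatory, but it reduces memory use.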
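
The effect of gradient tracking shows up on the output tensor itself. The following sketch uses a single nn.Linear layer as a stand-in for the network, and the torch.no_grad() context manager as a local alternative to the global torch.set_grad_enabled(False) switch.

```python
import torch
import torch.nn as nn

fc = nn.Linear(4, 3)          # stand-in for the full network
x = torch.ones(1, 4)

tracked = fc(x)               # default: a computation graph is recorded
with torch.no_grad():
    untracked = fc(x)         # inside this block no graph is built

print(tracked.requires_grad, tracked.grad_fn is not None)
print(untracked.requires_grad, untracked.grad_fn is None)
```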

# Predicting on a single image
import torch
import torch.nn as nn
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

# Set the print format
torch.set_printoptions(linewidth=120)

# 1. Prepare the data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST'
    ,train = True
    ,download = True
    , transform = transforms.Compose([
        transforms.ToTensor()
    ])
)
# 2. Build the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels = 1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels = 6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features = 12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features = 120, out_features = 60)
        self.out = nn.Linear(in_features = 60, out_features=10)
    def forward(self, t):
        # (1) Input Layer
        t = t
        # (2) hidden conv1
        t = self.conv1(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (3) hidden conv2 
        t = self.conv2(t)
        t = F.relu(t)
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # (4) hidden linear1
        t = t.reshape(-1, 12*4*4)
        t = self.fc1(t)
        t = F.relu(t)
        # (5) hidden linear2
        t = self.fc2(t)
        t = F.relu(t)
        # (6) output
        t = self.out(t)
        return t
        
# Create a network instance
torch.set_grad_enabled(False)    # turn off PyTorch's gradient computation
network = Network()
sample = next(iter(train_set))
image, label = sample
print(image.shape)
# Show the image and its label
#plt.imshow(image.squeeze(), cmap='gray')    # [1, 28, 28] -> [28, 28]
#print('label:', label)
# The image has shape [1, 28, 28], but the network expects a tensor of shape [batch_size, channels, height, width],
# so unsqueeze is used to add a batch dimension
print(image.unsqueeze(0).shape)
# Predict on the single image
pred = network(image.unsqueeze(0))
print(pred.shape)
print(pred.argmax(dim=1))
print(label)

out:
torch.Size([1, 28, 28])
torch.Size([1, 1, 28, 28])
torch.Size([1, 10])      # one image, ten prediction scores (one per class)
tensor([2])
9

# softmax turns the predictions into probabilities
print(F.softmax(pred, dim=1))
print(F.softmax(pred, dim=1).sum())

out:
# The prediction is inaccurate because the weights are untrained; this is just the result of random initialization
tensor([[0.1052, 0.0973, 0.0985, 0.1051, 0.1061, 0.0883, 0.0925, 0.0925, 0.1145, 0.1000]])
# The predicted probabilities over all classes sum to 1
tensor(1.0000)

3.10 Predicting on a single batch of images

import torch
import torch.nn as nn
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

torch.set_printoptions(linewidth=120)

print(torch.__version__)
print(torchvision.__version__)

out:
1.9.0
0.10.0

# Prepare the data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST'
    ,train = True
    ,download = True
    ,transform = transforms.Compose([
        transforms.ToTensor()
    ]))

# Build the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    def forward(self, t):
        #super(Network, self).__init__()
        #(1)Input Layer
        t = t
        #(2)Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        #(3)Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        #(4)FC1
        t = t.reshape(-1,12*4*4)
        t = F.relu(self.fc1(t))
        #(5)FC2
        t = F.relu(self.fc2(t))
        #(6)output
        t = self.out(t)
        return t

# Create a network instance
torch.set_grad_enabled(False)
network = Network()
# Take one batch from the data loader
data_loader  = torch.utils.data.DataLoader(train_set, batch_size=10)
# next(iter(data_loader)) makes the data loader return one batch of ten images
batch = next(iter(data_loader))
images, labels = batch
print(images.shape)
print(labels.shape)

out:
# [batch size of 10 images, 1 colour channel, height, width]
torch.Size([10, 1, 28, 28])
# 10 images, one label per image
torch.Size([10])

# Pass the image tensor to the network to get predictions
pred = network(images)
print(pred.shape)
print(pred)

out:
# The output has shape 10x10: 10 images, each with 10 prediction scores (raw outputs, not yet probabilities)
torch.Size([10, 10])
tensor([[-0.0193, -0.1122,  0.1257,  0.1487, -0.1571, -0.0396, -0.0396,  0.1130,  0.0356, -0.0452],
        [-0.0110, -0.1102,  0.1150,  0.1468, -0.1428, -0.0616, -0.0477,  0.1210,  0.0502, -0.0388],
        [-0.0192, -0.1119,  0.1058,  0.1287, -0.1349, -0.0584, -0.0487,  0.1007,  0.0328, -0.0372],
        [-0.0199, -0.1143,  0.1094,  0.1347, -0.1394, -0.0532, -0.0469,  0.1065,  0.0386, -0.0388],
        [-0.0153, -0.1097,  0.1158,  0.1431, -0.1525, -0.0596, -0.0413,  0.1135,  0.0452, -0.0478],
        [-0.0162, -0.1106,  0.1121,  0.1506, -0.1474, -0.0451, -0.0463,  0.1247,  0.0500, -0.0350],
        [-0.0336, -0.1143,  0.1142,  0.1397, -0.1467, -0.0355, -0.0486,  0.0978,  0.0278, -0.0389],
        [-0.0202, -0.1184,  0.1161,  0.1590, -0.1570, -0.0470, -0.0401,  0.1269,  0.0574, -0.0331],
        [-0.0173, -0.1080,  0.1156,  0.1301, -0.1344, -0.0578, -0.0503,  0.0940,  0.0322, -0.0312],
        [-0.0188, -0.1214,  0.1172,  0.1419, -0.1376, -0.0582, -0.0485,  0.1039,  0.0307, -0.0406]])

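# argmax returns the index of the largest prediction score
print(pred.argmax(dim=1))
print(labels)

out:
# Index 3 has the highest score for every image
tensor([3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
# The correct labels of the images
tensor([9, 0, 0, 3, 0, 2, 7, 2, 5, 5])

# Compare the predictions with the labels
print(pred.argmax(dim=1).eq(labels))
# Count the correct predictions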
print(pred.argmax(dim=1).eq(labels).sum())

out:
tensor([False, False, False,  True, False, False, False, False, False, False])
tensor(1)

print(F.softmax(pred, dim=1))
print(F.softmax(pred,dim=1).sum())

out:
tensor([[0.0975, 0.0889, 0.1127, 0.1154, 0.0850, 0.0956, 0.0956, 0.1113, 0.1030, 0.0950],
        [0.0982, 0.0890, 0.1114, 0.1150, 0.0861, 0.0934, 0.0947, 0.1121, 0.1044, 0.0955],
        [0.0981, 0.0894, 0.1112, 0.1138, 0.0874, 0.0944, 0.0953, 0.1106, 0.1034, 0.0964],
        [0.0979, 0.0890, 0.1114, 0.1142, 0.0868, 0.0946, 0.0952, 0.1110, 0.1037, 0.0960],
        [0.0981, 0.0893, 0.1119, 0.1150, 0.0855, 0.0939, 0.0956, 0.1116, 0.1042, 0.0950],
        [0.0976, 0.0888, 0.1109, 0.1153, 0.0856, 0.0948, 0.0947, 0.1123, 0.1043, 0.0958],
        [0.0967, 0.0892, 0.1121, 0.1149, 0.0863, 0.0965, 0.0952, 0.1102, 0.1028, 0.0961],
        [0.0971, 0.0880, 0.1112, 0.1161, 0.0847, 0.0945, 0.0952, 0.1125, 0.1049, 0.0958],
        [0.0982, 0.0897, 0.1121, 0.1138, 0.0873, 0.0943, 0.0950, 0.1097, 0.1032, 0.0968],
        [0.0980, 0.0885, 0.1123, 0.1151, 0.0870, 0.0942, 0.0952, 0.1108, 0.1030, 0.0959]])
tensor(10.0000)
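
The total of 10 is simply ten rows that each sum to 1. The sketch below uses a random [10, 10] tensor as a stand-in for the network output to show the per-row sums, and also that softmax does not change which class has the largest score.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
pred = torch.randn(10, 10)          # stand-in for the [10, 10] network output

probs = F.softmax(pred, dim=1)      # normalize each row (each image) separately
row_sums = probs.sum(dim=1)         # one sum per image
total = probs.sum()                 # sum over the whole batch

print(row_sums)
print(total)
print(torch.equal(pred.argmax(dim=1), probs.argmax(dim=1)))
```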

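3.11 How the input tensor changes as it passes through the CNN

3.11.1 CNN output feature map size (square input)

Let the input feature map have size n x n
Let the filter have size f x f
Let the padding be p and the stride be s
Then the output feature map has size O = floor((n - f + 2p)/s) + 1

3.11.2 CNN output feature map size (non-square input)

Let the input feature map have size nh x nw
Let the filter have size fh x fw
Let the padding be p and the stride be s
Then the output feature map has height Oh = floor((nh - fh + 2p)/s) + 1
and width Ow = floor((nw - fw + 2p)/s) + 1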
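
Applying the formula to the network in these notes explains where in_features=12*4*4 comes from. The helper name conv_output_size is made up for this sketch; max pooling with kernel_size=2 and stride=2 is treated as a 2x2 filter with stride 2.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output size along one dimension: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1

sizes = [28]                                              # 28x28 input image
sizes.append(conv_output_size(sizes[-1], f=5))            # conv1, 5x5 kernel
sizes.append(conv_output_size(sizes[-1], f=2, s=2))       # max pool 2x2, stride 2
sizes.append(conv_output_size(sizes[-1], f=5))            # conv2, 5x5 kernel
sizes.append(conv_output_size(sizes[-1], f=2, s=2))       # max pool 2x2, stride 2

print(sizes)
flattened = 12 * sizes[-1] * sizes[-1]                    # 12 output channels from conv2
print(flattened)
```

The spatial size shrinks 28 -> 24 -> 12 -> 8 -> 4, so the 12 channels of 4x4 flatten to 192 features.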

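3.12 Steps for training a neural network

3.12.1 The seven steps of training a neural network

1. Get a batch from the training set
2. Pass the batch to the network
3. Calculate the loss (the difference between the predicted and the true values) [done by the loss function]
4. Calculate the gradient of the loss function [done by back propagation]
5. Update the weights using the gradients from the previous step, reducing the loss [done by the optimization algorithm]
6. Repeat steps 1-5 until one epoch is complete
7. Repeat steps 1-6 for the chosen number of epochs, until a satisfactory accuracy is reached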
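
The seven steps can be sketched end to end on a toy problem. Everything specific here is an assumption for illustration: the data is synthetic (200 random 4-feature samples with two classes), the model is a single nn.Linear layer rather than the CNN, and the epoch count is arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Synthetic two-class data: the label is 1 when the features sum to a positive number
features = torch.randn(200, 4)
targets = (features.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(features, targets), batch_size=20)

model = nn.Linear(4, 2)
optimizer = optim.Adam(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(10):                      # step 7: repeat for several epochs
    total_loss = 0.0
    for x, y in loader:                      # steps 1 and 6: one batch at a time, whole epoch
        preds = model(x)                     # step 2: pass the batch to the network
        loss = F.cross_entropy(preds, y)     # step 3: calculate the loss
        optimizer.zero_grad()                # clear gradients left from the previous batch
        loss.backward()                      # step 4: calculate the gradients
        optimizer.step()                     # step 5: update the weights
        total_loss += loss.item()
    epoch_losses.append(total_loss)

print(epoch_losses[0], epoch_losses[-1])
```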

3.12.2 Training on a single batch of images

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import torchvision
import torchvision.transforms as transforms

torch.set_printoptions(linewidth=120)
torch.set_grad_enabled(True)  # not strictly necessary; gradient tracking is on by default

print(torch.__version__)
print(torchvision.__version__)

out:
1.9.0
0.10.0

# Function that counts the number of correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()

# 1. Get the training data
train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',
    train = True,
    download = True,
    transform = transforms.Compose([
        transforms.ToTensor()
    ])
    )

# 2. Build the network
class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12*4*4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)
    
    def forward(self, t):
        # Input Layer
        t = t
        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)
        # FC1
        t = t.reshape(-1, 12*4*4)
        t = F.relu(self.fc1(t))
        # FC2
        t = F.relu(self.fc2(t))
        # Output
        t = self.out(t)
        return t

# Create a network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
batch = next(iter(train_loader))
images, labels = batch
# Calculate the loss
preds = network(images)
loss = F.cross_entropy(preds,labels)   # cross-entropy loss function
print(loss.item())   # get the loss as a Python number

out:
2.296623468399048

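print(network.conv1.weight.grad)  # conv1's gradient: still None, because backward() has not been called yet
# Calculate the gradients of the loss
loss.backward()      # back propagation

out:
None

# Create the optimizer that will update the weights; learning rate = 0.01
optimizer = optim.Adam(network.parameters(), lr =0.01)
loss.item()      # show the current loss value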

out:
2.296623468399048

# Function that counts the number of correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()
print(get_num_correct(preds, labels))
# Update the weights
optimizer.step()

out:
11

preds = network(images)
loss = F.cross_entropy(preds,labels)
print(loss.item())
print(get_num_correct(preds, labels))

out:
# The loss has gone down and the number of correct predictions has gone up
2.272930383682251
12

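3.12.3 Summary of the single-batch training steps

Get a batch from the training set (lr is the learning rate, i.e. how far to step in the direction that reduces the loss)

network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr = 0.01)
batch = next(iter(train_loader))
images, labels = batch

Pass the batch to the network

preds = network(images)

Calculate the loss

loss = F.cross_entropy(preds, labels)

Calculate the gradients of the loss

loss.backward()

Use the gradients to update the weights and reduce the loss

optimizer.step()
print('loss1:',loss.item())  # loss before the update
preds = network(images)
loss = F.cross_entropy(preds, labels)
print('loss2:',loss.item())  # loss after the update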

3.13 Training the CNN for a single epoch

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import torchvision
import torchvision.transforms as transforms

torch.set_printoptions(linewidth=120)   # tells PyTorch how to format printed output
torch.set_grad_enabled(True)  # not strictly necessary; PyTorch's gradient tracking is on by default

print(torch.__version__)
print(torchvision.__version__)

train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',
    train = True,
    download = True,
    transform = transforms.Compose([
        transforms.ToTensor()
    ])
    )


class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # Input Layer
        t = t

        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # FC1
        t = t.reshape(-1, 12 * 4 * 4)
        t = F.relu(self.fc1(t))

        # FC2
        t = F.relu(self.fc2(t))

        # Output
        t = self.out(t)
        return t

# Function that counts the number of correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()


# Create the network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)

flag_sum = 0  # total number of training iterations

# Multiple epochs
for epoch in range(5):
    total_loss = 0
    total_correct = 0

    flag_epoch = 0  # number of training iterations in one epoch

    # One epoch
    for batch in train_loader:  # get a batch from the data; one batch is 100 images
        images, labels = batch
        preds = network(images)
        loss = F.cross_entropy(preds, labels)

        # Gradients are zeroed because loss.backward() adds the newly computed gradients to the values already stored; without zeroing, the gradients would accumulate
        optimizer.zero_grad()  # tell the optimizer to zero the weights' grad attributes; otherwise PyTorch accumulates gradients
        loss.backward()       # calculate the gradients
        # The gradient tells us which direction reduces the loss; the learning rate tells us how far to step in that direction
        optimizer.step()     # update the weights (all parameters)

        flag_sum += 1
        flag_epoch += 1

        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)
    print("epoch:", epoch, "loss:", total_loss, "total_correct:", total_correct)
print("flag_sum: ",flag_sum,"flag_epoch",flag_epoch)

accuracy = total_correct/len(train_set)
print("accuracy:",accuracy)



out:
1.9.0
0.10.0
epoch: 0 loss: 333.44036097824574 total_correct: 47315
epoch: 1 loss: 229.44155816733837 total_correct: 51582
epoch: 2 loss: 209.4198594391346 total_correct: 52302
epoch: 3 loss: 199.58387261629105 total_correct: 52657
epoch: 4 loss: 194.86104479432106 total_correct: 52838
flag_sum:  3000 flag_epoch 600
accuracy: 0.8806333333333334

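Iterations per epoch (flag_epoch) = total number of samples / batch size; here 60000 / 100 = 600. Changing the batch size changes the number of weight updates, i.e. the number of steps taken towards the minimum of the loss function
accuracy = total_correct/len(train_set)
Gradient: tells us which direction leads most quickly towards the minimum of the loss
The optimizer uses both the gradient and the learning rate: the gradient gives the direction, and the learning rate says how far to step in that direction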
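
The reason for calling optimizer.zero_grad() in every iteration can be seen on a tiny tensor. This sketch is independent of the network: it calls backward twice on a simple function of w and shows that the second call adds to the stored gradient instead of replacing it.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * w).sum().backward()         # gradient of sum(w^2) is 2w
first = w.grad.clone()

(w * w).sum().backward()         # second backward adds to the existing .grad
accumulated = w.grad.clone()

w.grad.zero_()                   # what optimizer.zero_grad() does for each parameter
(w * w).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
```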

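3.14 Confusion matrix for the neural network

Building a confusion matrix requires two things: a tensor of predictions and a tensor of the corresponding true values (labels)

CNN中的混淆矩阵 | PyTorch系列（二十三）
This linked article covers essentially the same material as this lesson, in great detail; strongly recommended

# Analysis based on the network trained in 3.13
print(len(train_set))
print(len(train_set.targets))

out:
60000
60000

Predicting on the entire training set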

def get_all_preds(model,loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images,labels = batch
        preds = model(images)
        all_preds = torch.cat((all_preds,preds), dim=0)
    return all_preds  # return all the predictions

prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=10000)
train_preds = get_all_preds(network, prediction_loader)
print(train_preds.shape)

out:
torch.Size([60000, 10])

print(train_preds.requires_grad)   #查看训练预测张量的梯度属性

train_preds.grad
# 即使训练中关于梯度张量的跟踪已打开，但在没有进行反向传播的情况下依旧不会有梯度的值

train_preds.grad_fn   # 由于train_preds是经过函数产生的，故具有该属性


out:
True

# 局部关闭梯度跟踪以减小内存损耗,也可使用torch.set.grad.enabled(False)进行全局关闭
with torch.no_grad():
    prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=1000)
    train_preds = get_all_preds(network, prediction_loader)

len(train_preds)
60000
print(train_preds.requires_grad)
False
train_preds.grad
train_preds.grad_fn
preds_correct = get_num_correct(train_preds, train_set.targets)
print("total_correct:",preds_correct)
print("accuracy:",preds_correct/len(train_set))
total_correct: 51988
accuracy: 0.8664666666666667
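
The helper get_num_correct used above is `preds.argmax(dim=1).eq(labels).sum().item()`. A plain-Python sketch of the same counting logic, with hypothetical scores for three samples and two classes, looks like this:

```python
def argmax(scores):
    """Index of the largest score (the first one if there is a tie)."""
    return max(range(len(scores)), key=lambda i: scores[i])

def get_num_correct(preds, labels):
    """Count rows whose highest-scoring class equals the true label."""
    return sum(argmax(row) == label for row, label in zip(preds, labels))

preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]   # hypothetical scores: 3 samples, 2 classes
labels = [1, 0, 0]

correct = get_num_correct(preds, labels)
print(correct, correct / len(labels))
```

Dividing the count by the number of samples gives the accuracy, which is how the 51988 correct predictions above turn into 0.8665.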

Plotting the confusion matrix (method 1)

print(train_set.targets)
print(train_set.targets.shape)

out:
tensor([9, 0, 0,  ..., 3, 0, 5])
torch.Size([60000])

print(train_preds.argmax(dim=1))
print(train_preds.argmax(dim=1).shape)

out:
tensor([9, 0, 0,  ..., 3, 0, 5])
torch.Size([60000])

stack = torch.stack((train_set.targets, train_preds.argmax(dim=1)),dim=1)
print(stack)

out:
tensor([[9, 9],
        [0, 0],
        [0, 0],
        ...,
        [3, 3],
        [0, 0],
        [5, 5]])

# tolist() gives access to each [target, pred] pair
print(stack[0].tolist())

out:
[9, 9]

# Create an empty (all-zero) confusion matrix
cmt = torch.zeros(10,10,dtype=torch.int32)
# Loop over all [target, pred] pairs and count how often each combination occurs
for p in stack:
    tl,pl = p.tolist()
    cmt[tl,pl] = cmt[tl,pl] + 1
print(cmt)

out:
tensor([[5661,    5,   77,   73,    8,    2,  117,    1,   56,    0],
        [  64, 5774,    5,  128,    5,    1,   20,    0,    3,    0],
        [ 111,    1, 4692,   82,  768,    1,  299,    0,   46,    0],
        [ 546,   20,   20, 5216,  138,    0,   56,    0,    4,    0],
        [  21,    6,  364,  297, 4830,    0,  419,    5,   58,    0],
        [  27,    6,    8,    1,    0, 5665,    2,  213,    8,   70],
        [1871,    9,  612,  127,  498,    0, 2792,    0,   91,    0],
        [   0,    0,    0,    0,    0,   49,    0, 5846,    3,  102],
        [  40,    1,   23,   20,   13,   15,   25,   15, 5846,    2],
        [   1,    0,    1,    0,    0,   20,    0,  307,    5, 5666]], dtype=torch.int32)
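
The matrix printed above can be read back in plain Python: the diagonal sums to the number of correct predictions, and the largest off-diagonal entry is the most frequent confusion. The sketch below copies the printed values into a nested list; the variable names are only for illustration.

```python
names = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
         'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot')

# Values copied from the printed confusion matrix (row = true label, column = prediction)
cmt = [
    [5661,    5,   77,   73,    8,    2,  117,    1,   56,    0],
    [  64, 5774,    5,  128,    5,    1,   20,    0,    3,    0],
    [ 111,    1, 4692,   82,  768,    1,  299,    0,   46,    0],
    [ 546,   20,   20, 5216,  138,    0,   56,    0,    4,    0],
    [  21,    6,  364,  297, 4830,    0,  419,    5,   58,    0],
    [  27,    6,    8,    1,    0, 5665,    2,  213,    8,   70],
    [1871,    9,  612,  127,  498,    0, 2792,    0,   91,    0],
    [   0,    0,    0,    0,    0,   49,    0, 5846,    3,  102],
    [  40,    1,   23,   20,   13,   15,   25,   15, 5846,    2],
    [   1,    0,    1,    0,    0,   20,    0,  307,    5, 5666],
]

n = len(cmt)
total = sum(sum(row) for row in cmt)            # all samples
correct = sum(cmt[i][i] for i in range(n))      # diagonal = correct predictions
print(correct, total, correct / total)

# Largest off-diagonal entry = most frequent confusion
count, true_idx, pred_idx = max(
    (cmt[i][j], i, j) for i in range(n) for j in range(n) if i != j
)
print(names[true_idx], '->', names[pred_idx], count)
```

The diagonal sum reproduces total_correct: 51988 from the previous step, and the most common mistake is a Shirt being predicted as T-shirt/top.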

import matplotlib.pyplot as plt
from resources.plotcm import plot_confusion_matrix
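# Note: plotcm is the file plotcm.py in the resources folder of the current directory; it defines the function plot_confusion_matrix() that is called here. The function can also be defined directly in the current .py file, but putting too many functions in the main script makes it long and harder to read.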
names = (
    'T-shirt/top',
    'Trouser',
    'Pullover',
    'Dress',
    'Coat',
    'Sandal',
    'Shirt',
    'Sneaker',
    'Bag',
    'Ankle boot')
plt.figure(figsize=(10,10))
plot_confusion_matrix(cmt, names)

out:
Confusion matrix, without normalization
tensor([[5190,   10,   19,  216,   53,   15,  454,    0,   43,    0],
        [  17, 5784,    6,  149,   13,    4,   18,    0,    9,    0],
        [  71,    2, 3313,   45, 1962,    4,  563,    0,   40,    0],
        [ 211,   33,   10, 5230,  402,    3,  106,    0,    4,    1],
        [   4,    8,  104,  107, 5558,    2,  194,    0,   23,    0],
        [   3,    1,    0,    1,    1, 5493,    0,  435,    4,   62],
        [1228,    6,  278,  117, 1305,    4, 2957,    0,  104,    1],
        [   0,    0,    0,    0,    0,   47,    0, 5885,    1,   67],
        [  33,    4,   11,   28,   47,   32,   59,   13, 5771,    2],
        [   0,    0,    0,    0,    1,   52,    0,  645,    7, 5295]], dtype=torch.int32)

Definition of the plot_confusion_matrix function:

Version 1: clearer and more concise
plotcm.py contains

import itertools
import numpy as np
import matplotlib.pyplot as plt

def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

Sample output: (figure not included)

Version 2:
plotcm.py contains

import numpy as np
import matplotlib.pyplot as plt

# Plot a row-normalized confusion matrix
def plot_confusion_matrix(cm, labels_name, title):
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]    # normalize each row (cm must be a NumPy array)
    plt.imshow(cm, interpolation='nearest')    # draw the matrix as an image
    plt.title(title)    # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)    # class names on the x axis
    plt.yticks(num_local, labels_name)    # class names on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

Sample output: (figure not included)

Plotting the confusion matrix (method 2)

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix   # requires the scikit-learn package
from resources.plotcm import plot_confusion_matrix  # plotcm.py is in the resources folder next to this file

cm = confusion_matrix(train_set.targets, train_preds.argmax(dim=1))
print(cm)
names = (
    'T-shirt/top',
    'Trouser',
    'Pullover',
    'Dress',
    'Coat',
    'Sandal',
    'Shirt',
    'Sneaker',
    'Bag',
    'Ankle boot')
plt.figure(figsize=(10,10))
plot_confusion_matrix(cm, names)

out:

Confusion matrix, without normalization
tensor([[5190,   10,   19,  216,   53,   15,  454,    0,   43,    0],
        [  17, 5784,    6,  149,   13,    4,   18,    0,    9,    0],
        [  71,    2, 3313,   45, 1962,    4,  563,    0,   40,    0],
        [ 211,   33,   10, 5230,  402,    3,  106,    0,    4,    1],
        [   4,    8,  104,  107, 5558,    2,  194,    0,   23,    0],
        [   3,    1,    0,    1,    1, 5493,    0,  435,    4,   62],
        [1228,    6,  278,  117, 1305,    4, 2957,    0,  104,    1],
        [   0,    0,    0,    0,    0,   47,    0, 5885,    1,   67],
        [  33,    4,   11,   28,   47,   32,   59,   13, 5771,    2],
        [   0,    0,    0,    0,    1,   52,    0,  645,    7, 5295]], dtype=torch.int32)
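
Raw counts are easiest to compare once each row is divided by its own sum, which is what the expression `cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]` in the plotting functions does. A minimal plain-Python sketch of that row normalization, using a hypothetical 3x3 matrix, is shown here:

```python
def normalize_rows(cm):
    """Divide every row by its sum so that each row adds up to 1.

    Rows whose sum is 0 are returned as all zeros to avoid dividing by zero.
    """
    out = []
    for row in cm:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

# Hypothetical counts: row = true label, column = predicted label
cm = [[8, 2, 0],
      [1, 6, 3],
      [0, 5, 5]]

normalized = normalize_rows(cm)
for row in normalized:
    print([round(v, 2) for v in row])
```

After normalization the diagonal entry of each row is the fraction of that class predicted correctly, so classes of different sizes become directly comparable.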

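3.15 Concatenating versus stacking

Concatenating (cat) joins a sequence of tensors along an existing axis.
Stacking (stack) joins a sequence of tensors along a new axis (that is, a new axis is first created in every tensor).

For details and animated examples, see video p28.

# Add a new axis to a tensor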
import torch
t = torch.tensor([1,1,1]) 
print(t.unsqueeze(dim=0))
print(t.unsqueeze(dim=0).shape)
print(t.unsqueeze(dim=1))
print(t.unsqueeze(dim=1).shape)

out:
tensor([[1, 1, 1]])
torch.Size([1, 3])
tensor([[1],
        [1],
        [1]])
torch.Size([3, 1])

# Concatenating and stacking in PyTorch
t1 = torch.tensor([1,1,1])
t2 = torch.tensor([2,2,2])
t3 = torch.tensor([3,3,3])
# Concatenating
t_cat = torch.cat((t1,t2,t3), dim=0)
print(t_cat)

# Stacking
t_stack = torch.stack((t1, t2, t3), dim=0)
print(t_stack)

# Stacking is equivalent to adding a new axis to each tensor and then concatenating
t_stack1 = torch.cat((t1.unsqueeze(0),t2.unsqueeze(0),t3.unsqueeze(0)), dim =0)
print(t1.unsqueeze(0))
print(t_stack1)


out:
tensor([1, 1, 1, 2, 2, 2, 3, 3, 3])
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
tensor([[1, 1, 1]])
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
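
The axis chosen for stacking decides how the inputs are grouped. Section 3.14 used `torch.stack(..., dim=1)` to pair each target with its prediction; in plain-Python terms that is what `zip` does. The sketch below uses nested lists as a stand-in for tensors and is only an illustration of the grouping, not of PyTorch itself.

```python
t1 = [1, 1, 1]
t2 = [2, 2, 2]
t3 = [3, 3, 3]

# Stacking along a new first axis: each input becomes one row
stack_dim0 = [list(t) for t in (t1, t2, t3)]

# Stacking along a new last axis: element i of every input is grouped together
stack_dim1 = [list(group) for group in zip(t1, t2, t3)]

print(stack_dim0)
print(stack_dim1)
```

Both results have 3 rows and 3 columns here, but the second one is the transpose of the first, which is why stacking targets and predictions along dim=1 yields one [target, pred] pair per sample.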

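I have not run the TensorFlow code below.

# Concatenating and stacking in TensorFlow
import tensorflow as tf
t1 = tf.constant([1,1,1])
t2 = tf.constant([2,2,2])
t3 = tf.constant([3,3,3])
# Concatenating
t_cat = tf.concat((t1, t2, t3), axis =0)
print(t_cat)

# Stacking (tf.stack, not tf.concat, creates the new axis)
t_stack = tf.stack((t1, t2, t3), axis =0)
print(t_stack)

# Expected shapes: t_cat has shape (9,), t_stack has shape (3, 3); the exact printed form depends on the TensorFlow version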

# Concatenating and stacking in NumPy
import numpy as np
t1 = np.array([1,1,1])
t2 = np.array([2,2,2])
t3 = np.array([3,3,3])
# Concatenating
t_cat = np.concatenate((t1,t2,t3), axis=0)   
print(t_cat)
# Stacking
t_stack = np.stack((t1,t2,t3), axis =0)
print(t_stack)
# Stacking expressed as expand_dims followed by concatenate
t_cat_to_stack = np.concatenate(
    (
        np.expand_dims(t1, 0)
       ,np.expand_dims(t2, 0)
       ,np.expand_dims(t3, 0)
    )
    ,axis=0
)
print(t_cat_to_stack)

out:
[1 1 1 2 2 2 3 3 3]
[[1 1 1]
 [2 2 2]
 [3 3 3]]
[[1 1 1]
 [2 2 2]
 [3 3 3]]
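
The difference between the two operations is easiest to remember as a rule about shapes. The sketch below computes the result shape only, without touching any data; `cat_shape` and `stack_shape` are hypothetical helper names, and non-negative axis indices are assumed.

```python
def cat_shape(shapes, axis):
    """Result shape of concatenating along an existing axis."""
    first = shapes[0]
    for s in shapes:
        same_rank = len(s) == len(first)
        # Every axis except the concatenation axis must have the same size
        if not same_rank or any(a != b for k, (a, b) in enumerate(zip(s, first)) if k != axis):
            raise ValueError("shapes differ outside the concatenation axis")
    out = list(first)
    out[axis] = sum(s[axis] for s in shapes)   # sizes add up along `axis`
    return tuple(out)

def stack_shape(shapes, axis):
    """Result shape of stacking along a new axis inserted at position `axis`."""
    first = shapes[0]
    if any(s != first for s in shapes):
        raise ValueError("stacking requires identical shapes")
    return first[:axis] + (len(shapes),) + first[axis:]   # new axis has one slot per input

three_vectors = [(3,), (3,), (3,)]
print(cat_shape(three_vectors, axis=0))        # existing axis grows
print(stack_shape(three_vectors, axis=0))      # a new axis is created
print(cat_shape([(1, 3)] * 3, axis=0))         # unsqueeze first, then cat: same shape as stack
```

Concatenation keeps the number of axes and adds sizes along one of them; stacking adds one axis whose length is the number of inputs.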

Summary: complete code for one full training run and plotting

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import itertools
from sklearn.metrics import confusion_matrix    # builds the confusion matrix
import matplotlib.pyplot as plt
# from resources.plotcm import plot_confusion_matrix

import numpy as np
import torchvision
import torchvision.transforms as transforms

torch.set_printoptions(linewidth=120)   # controls how PyTorch prints tensors
torch.set_grad_enabled(True)  # not required: PyTorch's gradient tracking is on by default

print(torch.__version__)
print(torchvision.__version__)

train_set = torchvision.datasets.FashionMNIST(
    root = './data/FashionMNIST',
    train = True,
    download = True,
    transform = transforms.Compose([
        transforms.ToTensor()
    ])
    )


class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=12, kernel_size=5)
        self.fc1 = nn.Linear(in_features=12 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=60)
        self.out = nn.Linear(in_features=60, out_features=10)

    def forward(self, t):
        # Input Layer
        t = t

        # Conv1
        t = F.relu(self.conv1(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # Conv2
        t = F.relu(self.conv2(t))
        t = F.max_pool2d(t, kernel_size=2, stride=2)

        # FC1
        t = t.reshape(-1, 12 * 4 * 4)
        t = F.relu(self.fc1(t))

        # FC2
        t = F.relu(self.fc2(t))

        # Output
        t = self.out(t)
        return t

# Count the number of correct predictions
def get_num_correct(preds, labels):
    return preds.argmax(dim=1).eq(labels).sum().item()


# Create the network instance
network = Network()
train_loader = torch.utils.data.DataLoader(train_set, batch_size=100)
optimizer = optim.Adam(network.parameters(), lr=0.01)

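flag_sum = 0  # total number of training iterations (weight updates)

# Number of epochs; adjust as needed
for epoch in range(1):
    total_loss = 0
    total_correct = 0

    flag_epoch = 0  # number of iterations within one epoch

    # One epoch
    for batch in train_loader:  # get one batch from the dataset; each batch holds 100 images
        images, labels = batch
        preds = network(images)
        loss = F.cross_entropy(preds, labels)

        # Zero the gradients first: loss.backward() adds the newly computed gradients to the values already stored in each parameter's .grad, so without zeroing they would accumulate across batches.
        optimizer.zero_grad()  # reset the gradients of all parameters held by the optimizer
        loss.backward()       # compute the gradients
        # The gradient gives the direction in which the loss changes fastest (stepping against it lowers the loss); the learning rate sets how far to step.
        optimizer.step()     # update the weights (all parameters)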

        flag_sum += 1
        flag_epoch += 1

        total_loss += loss.item()
        total_correct += get_num_correct(preds, labels)
    print("epoch:", epoch, "loss:", total_loss, "total_correct:", total_correct)
print("flag_sum: ",flag_sum,"flag_epoch",flag_epoch)

accuracy = total_correct/len(train_set)
print("accuracy:",accuracy)


# Analysis based on the network trained in section 3.13
len(train_set)
len(train_set.targets)

# Collect the predictions for the whole dataset
def get_all_preds(model,loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images,labels = batch
        preds = model(images)
        all_preds = torch.cat((all_preds,preds), dim=0)
    return all_preds

# Plot a row-normalized confusion matrix (cm must be a NumPy array)
def plot_confusion_matrix(cm, labels_name, title):
    cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]    # normalize each row
    plt.imshow(cm, interpolation='nearest')    # draw the matrix as an image
    plt.title(title)    # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)    # class names on the x axis
    plt.yticks(num_local, labels_name)    # class names on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Plot the confusion matrix with raw counts (no normalization)
def plot_confusion_matrix_1(cm, labels_name, title):
    #cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]    # normalization (disabled)
    plt.imshow(cm, interpolation='nearest')    # draw the matrix as an image
    plt.title(title)    # figure title
    plt.colorbar()
    num_local = np.array(range(len(labels_name)))
    plt.xticks(num_local, labels_name, rotation=90)    # class names on the x axis
    plt.yticks(num_local, labels_name)    # class names on the y axis
    plt.ylabel('True label')
    plt.xlabel('Predicted label')


def plot_confusion_matrix_2(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')



prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=10000)
train_preds = get_all_preds(network, prediction_loader)

print(train_preds.shape)

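print(train_preds.requires_grad)   # check whether the prediction tensor tracks gradients

print(train_preds.grad)
# Even with gradient tracking turned on, .grad holds no value until a backward pass has been run

print(train_preds.grad_fn)   # set because train_preds is the result of a tracked operation

# Turn gradient tracking off locally to reduce memory use; torch.set_grad_enabled(False) turns it off globally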
with torch.no_grad():
    prediction_loader = torch.utils.data.DataLoader(train_set, batch_size=1000)
    train_preds = get_all_preds(network, prediction_loader)

len(train_preds)

print(train_preds.requires_grad)

print(train_preds.grad)

print(train_preds.grad_fn)

preds_correct = get_num_correct(train_preds, train_set.targets)
print("total_correct:",preds_correct)
print("accuracy:",preds_correct/len(train_set))

print(train_set.targets)
print(train_set.targets.shape)

print(train_preds.argmax(dim=1))
print(train_preds.argmax(dim=1).shape)

stack = torch.stack((train_set.targets, train_preds.argmax(dim=1)),dim=1)
print(stack)


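# tolist() gives access to each [target, pred] pair
print(stack[0].tolist())

# Create an empty (all-zero) confusion matrix
cmt = torch.zeros(10,10,dtype=torch.int32)

# Loop over all [target, pred] pairs and count how often each combination occurs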
for p in stack:
    tl,pl = p.tolist()
    cmt[tl,pl] = cmt[tl,pl] + 1

print(cmt)
cm = confusion_matrix(train_set.targets, train_preds.argmax(dim=1))
names = (
    'T-shirt/top',
    'Trouser',
    'Pullover',
    'Dress',
    'Coat',
    'Sandal',
    'Shirt',
    'Sneaker',
    'Bag',
    'Ankle boot')
plt.figure(figsize=(10, 10))
plot_confusion_matrix(cm, names, "pred")
plt.show()

plt.figure(figsize=(10, 10))
plot_confusion_matrix_1(cmt, names, "haha")
plt.show()

plt.figure(figsize=(10, 10))
plot_confusion_matrix_2(cmt, names)
plt.show()

out:

1.9.0
0.10.0
epoch: 0 loss: 344.86296156048775 total_correct: 46965
flag_sum:  600 flag_epoch 600
accuracy: 0.78275
torch.Size([60000, 10])
True
None

False
None
None
total_correct: 50476
accuracy: 0.8412666666666667
tensor([9, 0, 0,  ..., 3, 0, 5])
torch.Size([60000])
tensor([9, 0, 3,  ..., 3, 0, 5])
torch.Size([60000])
tensor([[9, 9],
        [0, 0],
        [0, 3],
        ...,
        [3, 3],
        [0, 0],
        [5, 5]])
[9, 9]
tensor([[5190,   10,   19,  216,   53,   15,  454,    0,   43,    0],
        [  17, 5784,    6,  149,   13,    4,   18,    0,    9,    0],
        [  71,    2, 3313,   45, 1962,    4,  563,    0,   40,    0],
        [ 211,   33,   10, 5230,  402,    3,  106,    0,    4,    1],
        [   4,    8,  104,  107, 5558,    2,  194,    0,   23,    0],
        [   3,    1,    0,    1,    1, 5493,    0,  435,    4,   62],
        [1228,    6,  278,  117, 1305,    4, 2957,    0,  104,    1],
        [   0,    0,    0,    0,    0,   47,    0, 5885,    1,   67],
        [  33,    4,   11,   28,   47,   32,   59,   13, 5771,    2],
        [   0,    0,    0,    0,    1,   52,    0,  645,    7, 5295]], dtype=torch.int32)
Confusion matrix, without normalization
tensor([[5190,   10,   19,  216,   53,   15,  454,    0,   43,    0],
        [  17, 5784,    6,  149,   13,    4,   18,    0,    9,    0],
        [  71,    2, 3313,   45, 1962,    4,  563,    0,   40,    0],
        [ 211,   33,   10, 5230,  402,    3,  106,    0,    4,    1],
        [   4,    8,  104,  107, 5558,    2,  194,    0,   23,    0],
        [   3,    1,    0,    1,    1, 5493,    0,  435,    4,   62],
        [1228,    6,  278,  117, 1305,    4, 2957,    0,  104,    1],
        [   0,    0,    0,    0,    0,   47,    0, 5885,    1,   67],
        [  33,    4,   11,   28,   47,   32,   59,   13, 5771,    2],
        [   0,    0,    0,    0,    1,   52,    0,  645,    7, 5295]], dtype=torch.int32)

Process finished with exit code 0
