PyTorch Basics (6): Neural Network Skeleton, Convolution Operations, and Convolutional Layers

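1 Preface

While following Mu Li's "Dive into Deep Learning" (PyTorch edition) video course on Bilibili, I found that I could not follow many of the PyTorch operations; I was often just "getting hands-on" without really thinking. So I am using the winter break to catch up on and organize the PyTorch operations, in order to keep pace with the course.

Learning resources:

  • Video by the Bilibili uploader 我是土堆: PyTorch deep learning quick-start tutorial (absolutely easy to understand!) [小土堆]
  • PyTorch Chinese handbook: (pytorch handbook)
  • Datawhale open-source material: 深入浅出PyTorch (thorough-pytorch)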

2 Building a network

See torch.nn in the official PyTorch documentation.


2.1 Subclassing nn.Module

  • Override __init__ and forward()
import torch
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        
    def forward(self, input):
        output = input + 1
        return output
    
tudui = Tudui()
x = torch.tensor(1.0)
output = tudui(x)
print(output)
tensor(2.)
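
Calling `tudui(x)` works because `nn.Module` defines `__call__`, which runs any registered hooks and then dispatches to `forward()`. The sketch below illustrates this with a hypothetical module named `AddOne` (the name is an assumption for illustration and is not part of the original tutorial).

```python
import torch
from torch import nn


class AddOne(nn.Module):
    """Hypothetical module for illustration: adds 1 to its input."""

    def __init__(self):
        # Must run before any submodules or parameters are registered.
        super().__init__()

    def forward(self, x):
        return x + 1


model = AddOne()
x = torch.tensor(1.0)

via_call = model(x)             # preferred: goes through nn.Module.__call__
via_forward = model.forward(x)  # same value here, but bypasses hooks

print(via_call, via_forward)
print(len(list(model.parameters())))  # this module has no learnable parameters
```

In practice the module is called directly rather than through `forward()`, so that hooks attached to it still fire.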

3 Convolution with torch.nn.functional.conv2d()

3.1 Manual calculation

3.1.1 stride

[Image 3 in the original post]

3.1.2 padding

[Image 4 in the original post]
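
The manual calculation for stride and padding can be sketched in plain Python. This is a minimal illustration, not PyTorch's implementation: `conv2d_manual` is a hypothetical helper, and it uses the same 5x5 input and 3x3 kernel as section 3.2. Like `F.conv2d()`, it slides the kernel without flipping it (strictly speaking, a cross-correlation).

```python
def conv2d_manual(image, kernel, stride=1, padding=0):
    """Hypothetical helper: slide `kernel` over `image` (both 2-D lists)."""
    # Surround the image with `padding` rings of zeros.
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]
    for row in image:
        padded.append([0] * padding + list(row) + [0] * padding)
    padded += [[0] * width for _ in range(padding)]

    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    # The window's top-left corner moves `stride` cells at a time.
    for top in range(0, len(padded) - k_h + 1, stride):
        out_row = []
        for left in range(0, width - k_w + 1, stride):
            # Multiply element-wise and sum over the window.
            total = sum(
                padded[top + i][left + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            out_row.append(total)
        out.append(out_row)
    return out


image = [[1, 2, 0, 3, 1],
         [0, 1, 2, 3, 1],
         [1, 2, 1, 0, 0],
         [5, 2, 3, 1, 1],
         [2, 1, 0, 1, 1]]

kernel = [[1, 2, 1],
          [0, 1, 0],
          [2, 1, 0]]

stride1 = conv2d_manual(image, kernel, stride=1)
stride2 = conv2d_manual(image, kernel, stride=2)
padded1 = conv2d_manual(image, kernel, stride=1, padding=1)

print(stride1)
print(stride2)
print(padded1)
```

The three results should agree with the `F.conv2d()` outputs shown in section 3.2.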

3.2 Computing with F.conv2d()

  • F.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)-> Tensor

    1. input: input tensor of shape (minibatch, in_channels, iH, iW), i.e. a 4-D tensor
    2. padding: the number of rings of zeros added around the input
import torch.nn.functional as F
import torchvision
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1], 
                      [1, 2, 1, 0, 0], 
                      [5, 2, 3, 1, 1], 
                      [2, 1, 0, 1, 1]])
                      
kernel = torch.tensor([[1, 2, 1], 
                       [0, 1, 0], 
                       [2, 1, 0]])
print(input.shape)
print(kernel.shape)
 
input = torch.reshape(input, (1, 1, 5, 5)) # F.conv2d() expects input as (minibatch, in_channels, iH, iW)
kernel = torch.reshape(kernel, (1, 1, 3, 3)) # weight as (out_channels, in_channels/groups, kH, kW)
print(input.shape)
print(kernel.shape)
torch.Size([5, 5])
torch.Size([3, 3])
torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])

3.2.1 stride

output = F.conv2d(input, kernel, stride = 1)
print(output)
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
output2 = F.conv2d(input, kernel, stride = 2)
print(output2)
tensor([[[[10, 12],
          [13,  3]]]])

3.2.2 padding

output3 = F.conv2d(input, kernel, stride = 1, padding = 1)
print(output3)
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])

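4 Convolution Layers

4.1 torch.nn.Conv2d()

Official PyTorch documentation

Parameters:

  • in_channels (int) – number of channels in the input image (for example, 3 for an RGB image)

  • out_channels (int) – number of channels produced by the convolution

  • kernel_size (int or tuple) – size of the convolving kernel

  • stride (int or tuple, optional) – step size of the convolution. Default: 1

  • padding (int, tuple or str, optional) – padding added to all four sides of the input. Default: 0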

  • padding_mode (string, optional) – ‘zeros’, ‘reflect’, ‘replicate’ or ‘circular’. Default: ‘zeros’

  • dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1

  • groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1

  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
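
To see how these parameters shape the layer, the sketch below builds a `Conv2d` with the same settings used later in this section and inspects its learnable tensors. The random input is an assumption standing in for one CIFAR10-sized batch.

```python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

# Weight layout: (out_channels, in_channels // groups, kernel_h, kernel_w)
print(conv.weight.shape)
# One bias value per output channel (present because bias=True by default)
print(conv.bias.shape)

# Random stand-in for a batch of 64 RGB images of size 32x32
x = torch.randn(64, 3, 32, 32)
y = conv(x)
print(y.shape)
```

Each of the 6 output channels has its own 3x3 kernel for every one of the 3 input channels, which is why the weight tensor is 4-D.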

More illustrated examples

dataset = torchvision.datasets.CIFAR10('../dataset', train = False, transform = torchvision.transforms.ToTensor(), download = True)
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ../dataset\cifar-10-python.tar.gz
dataloader = DataLoader(dataset, batch_size = 64)
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.conv1 = Conv2d(in_channels = 3, out_channels = 6, kernel_size = 3, stride = 1, padding = 0)
#         self.conv1 = Conv2d(3, 6, 3)  # equivalent positional form; kernel_size is required

    def forward(self, x):
        x = self.conv1(x)
        return x
tudui = Tudui()
print(tudui) # print the network structure
Tudui(
  (conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
)

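Check the image shape after the convolution: the batch size is 64, the number of channels goes from 3 to 6, and the width and height each shrink by 2.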

for data in dataloader:
    imgs, targets = data
    output = tudui(imgs)
    
    print(imgs.shape)
    print(output.shape)
    
    break
torch.Size([64, 3, 32, 32])
torch.Size([64, 6, 30, 30])

To see the effect more directly, log the images to TensorBoard:

from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter("logs")

step = 0
for data in dataloader:
    imgs, targets = data
    output = tudui(imgs) # torch.Size([64, 6, 30, 30]); add_images cannot display 6-channel images
    
    writer.add_images("input", imgs, step)
    
    # Force the channel count to 3 so the result can be displayed; the -1 dimension is inferred from the others
    output = torch.reshape(output, (-1, 3, 30, 30))
    
    writer.add_images("output", output, step)
    
    step += 1

writer.close() # flush pending events to disk
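
The reshape in the loop above does not merge channels. It splits each 6-channel output into two 3-channel images, so the batch dimension doubles. Below is a small sketch of that behavior, with a random tensor assumed as a stand-in for the real convolution output:

```python
import torch

# Random stand-in for the convolution output of one full batch
output = torch.randn(64, 6, 30, 30)

# -1 is inferred as 64 * 6 / 3 = 128
reshaped = torch.reshape(output, (-1, 3, 30, 30))
print(reshaped.shape)

# The first two "images" come from the same sample:
# channels 0..2 and channels 3..5 of sample 0.
first_half_matches = torch.equal(reshaped[0], output[0, :3])
second_half_matches = torch.equal(reshaped[1], output[0, 3:])
print(first_half_matches, second_half_matches)
```

This is only a visualization trick: the resulting "RGB" images are feature maps displayed as colors, not meaningful pictures.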

Input:

[Image 5 in the original post]

Output:

[Image 6 in the original post]

4.2 Computing the output shape of a convolutional layer

[Image 7 in the original post]
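
For reference, the torch.nn.Conv2d documentation gives the output height as H_out = floor((H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1), and the width in the same way. The sketch below applies that formula along one dimension; `conv_output_size` is a hypothetical helper name, not a PyTorch function.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper: output length along one spatial dimension."""
    # Integer floor division implements the floor() in the documented formula.
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# CIFAR10 image through Conv2d(3, 6, kernel_size=3): 32 -> 30
cifar_out = conv_output_size(32, kernel_size=3)

# The 5x5 example from section 3.2
plain_out = conv_output_size(5, kernel_size=3)               # stride 1, no padding
strided_out = conv_output_size(5, kernel_size=3, stride=2)   # stride 2
padded_out = conv_output_size(5, kernel_size=3, padding=1)   # one ring of zeros

print(cifar_out, plain_out, strided_out, padded_out)
```

These values match the shapes observed earlier: 30x30 for the CIFAR10 batch, and 3x3, 2x2 and 5x5 for the three `F.conv2d()` calls.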
