[Basics] PyTorch Study Notes (4): nn.Conv1d, nn.Conv2d, nn.Conv3d

A quick summary of how 1D, 2D, and 3D convolutions are used.

import torch
import torch.nn as nn

I. nn.Conv1d

1. Definition

class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

2. Parameters

| Parameter | Meaning | Default |
| --- | --- | --- |
| in_channels (int) | Number of input channels; for text, the word-embedding dimension | - |
| out_channels (int) | Number of channels produced by the convolution, i.e. the number of kernels | - |
| kernel_size (int or tuple) | Size of the convolution kernel (effectively kernel_size × in_channels) | - |
| stride (int or tuple, optional) | Stride of the convolution | 1 |
| padding (int or tuple, optional) | Amount of zero padding added to both sides of the input | 0 |
| dilation (int or tuple, optional) | Spacing between kernel elements | 1 |
| groups (int, optional) | Number of blocked connections from input channels to output channels | 1 |
| bias (bool, optional) | If True, adds a learnable bias | True |
| padding_mode (string, optional) | 'zeros', 'reflect', 'replicate' or 'circular' | 'zeros' |
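The groups parameter is the least intuitive of these, so here is a small illustrative sketch (values chosen for demonstration): setting groups equal to in_channels gives a depthwise convolution, where each input channel gets its own filter and no cross-channel mixing happens.

```python
import torch
import torch.nn as nn

# Depthwise Conv1d: groups=in_channels means each input channel is
# convolved with its own single-channel filter.
depthwise = nn.Conv1d(in_channels=4, out_channels=4, kernel_size=3, groups=4)
x = torch.randn(8, 4, 10)   # (batch, channels, length)
y = depthwise(x)
print(y.shape)              # torch.Size([8, 4, 8]); 8 = 10 - 3 + 1

# Each filter now has shape (1, 3) instead of (4, 3), so the layer has
# 4*(1*3) + 4 = 16 parameters instead of 4*(4*3) + 4 = 52.
print(sum(p.numel() for p in depthwise.parameters()))  # 16
```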

3. Input/output shapes
Input: (batch_size, word_vector_dim, sequence_len), i.e. (N, C_in, L_in)
Output: (N, C_out, L_out)
L_out = floor{ [ L_in + 2 × padding − dilation × (kernel_size − 1) − 1 ] / stride } + 1

With the defaults (stride=1, padding=0, dilation=1): L_out = L_in − kernel_size + 1
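The formula is easy to verify directly; in this sketch (values are arbitrary, chosen so that every term in the formula is non-trivial) the formula's prediction is compared against the actual output length:

```python
import torch
import torch.nn as nn

L_in, padding, dilation, kernel_size, stride = 35, 2, 2, 3, 2
conv = nn.Conv1d(16, 8, kernel_size, stride=stride,
                 padding=padding, dilation=dilation)
out = conv(torch.randn(1, 16, L_in))

# L_out = floor((L_in + 2*padding - dilation*(kernel_size-1) - 1) / stride) + 1
L_out = (L_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
print(out.shape[-1], L_out)  # both 18
```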

4. Example
[Figure 1: Conv1d applied to a text tensor of shape (batch_size, 256, 35)]

conv1 = nn.Conv1d(in_channels=256, out_channels=100, kernel_size=2)

# batch size 32, max sentence length 35, embedding dimension 256: 32*35*256
input = torch.randn(32, 35, 256)
# Conv1d slides over the last dimension, so move the channel axis forward:
# batch size 32, embedding dimension 256, max sentence length 35: 32*256*35
input = input.permute(0, 2, 1)
# out: 32*100*(35-2+1) = 32*100*34
out = conv1(input)
print(out.size())

Output:

torch.Size([32, 100, 34])

II. nn.Conv2d

1. Definition

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

2. Parameters
Similar to Conv1d.

3. Input/output shapes
Input: (N, C_in, H_in, W_in)
Output: (N, C_out, H_out, W_out)
H_out = floor{ [ H_in + 2 × padding[0] − dilation[0] × (kernel_size[0] − 1) − 1 ] / stride[0] } + 1
W_out = floor{ [ W_in + 2 × padding[1] − dilation[1] × (kernel_size[1] − 1) − 1 ] / stride[1] } + 1

4. Example

>>> # With square kernels and equal stride
>>> m = nn.Conv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
>>> # non-square kernels and unequal stride and with padding and dilation
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)  # batch_size * in_channels * height * width
>>> output = m(input)
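As a check, the H_out/W_out formulas above predict the output shape of the last (dilated) layer in this example:

```python
import torch
import torch.nn as nn

# Same layer as above: kernel (3, 5), stride (2, 1), padding (4, 2), dilation (3, 1)
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
output = m(torch.randn(20, 16, 50, 100))

H_out = (50 + 2 * 4 - 3 * (3 - 1) - 1) // 2 + 1   # 26
W_out = (100 + 2 * 2 - 1 * (5 - 1) - 1) // 1 + 1  # 100
print(output.shape)  # torch.Size([20, 33, 26, 100])
```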

5. Tracking tensor sizes through part of a CNN forward pass on an image

  • Image size 32×32, batch_size = 2
  • First conv layer: nn.Conv2d(1, 6, 5)
  • Second conv layer: nn.Conv2d(6, 16, 5)
  • Pooling kernel size: 2
  • Input: (2, 1, 32, 32)
  • After the first conv: (2, 6, 28, 28), since 28 = 32 − 5 + 1
  • After the first max pool: (2, 6, 14, 14), since 14 = 28 / 2
  • After the second conv: (2, 16, 10, 10), since 10 = 14 − 5 + 1
  • After the second max pool: (2, 16, 5, 5)
  • Flattened before the first FC layer: (2, 16 × 5 × 5) = (2, 400)
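The shape bookkeeping above can be verified step by step with a minimal sketch (layer and variable names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(2, 1, 32, 32)   # (batch, channels, H, W)
x = conv1(x)                    # (2, 6, 28, 28)   28 = 32 - 5 + 1
x = F.max_pool2d(x, 2)          # (2, 6, 14, 14)
x = conv2(x)                    # (2, 16, 10, 10)  10 = 14 - 5 + 1
x = F.max_pool2d(x, 2)          # (2, 16, 5, 5)
x = x.view(x.size(0), -1)       # (2, 400)         400 = 16 * 5 * 5
print(x.shape)
```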

III. nn.Conv3d

1. Definition

class torch.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

2. Parameters
Essentially the same meanings as above.

3. Input/output shapes
Input: (N, C_in, D_in, H_in, W_in)
Output: (N, C_out, D_out, H_out, W_out)
Each spatial dimension follows the same formula as Conv1d/Conv2d; the full formulas are in the official documentation under nn.Conv3d.

4. Examples
Example 1

import torch
import torch.nn as nn

# Each input sample has shape (7, 60, 40): 7 is the number of frames,
# 60 and 40 are the height and width.
# kernel_size[0] = 3 is the number of frames processed at a time;
# (7, 7) is the spatial size of the kernel.
m = nn.Conv3d(3, 3, (3, 7, 7), stride=1, padding=0)
# autograd.Variable is deprecated; tensors can be passed to modules directly.
input = torch.randn(1, 3, 7, 60, 40)
output = m(input)
print(output.size())
# Output: torch.Size([1, 3, 5, 54, 34])

Example 2

>>> # With square kernels and equal stride
>>> m = nn.Conv3d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv3d(16, 33, (3, 5, 2), stride=(2, 1, 1), padding=(4, 2, 0))
>>> input = torch.randn(20, 16, 10, 50, 100)
>>> output = m(input)
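As a sanity check, the output shape of the non-square layer above follows from applying the per-dimension formula to depth, height, and width in turn:

```python
import torch
import torch.nn as nn

# Same layer as above: kernel (3, 5, 2), stride (2, 1, 1), padding (4, 2, 0)
m = nn.Conv3d(16, 33, (3, 5, 2), stride=(2, 1, 1), padding=(4, 2, 0))
output = m(torch.randn(20, 16, 10, 50, 100))

# Per dimension: out = floor((in + 2p - (k - 1) - 1) / s) + 1  (dilation = 1)
D_out = (10 + 2 * 4 - (3 - 1) - 1) // 2 + 1   # 8
H_out = (50 + 2 * 2 - (5 - 1) - 1) // 1 + 1   # 50
W_out = (100 + 2 * 0 - (2 - 1) - 1) // 1 + 1  # 99
print(output.shape)  # torch.Size([20, 33, 8, 50, 99])
```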
