While following the PyTorch video lectures for "Dive into Deep Learning" that Mu Li published on Bilibili, I found many of the PyTorch operations hard to follow: I was only "getting hands-on" without engaging my brain. So I plan to use the winter break to properly review and organize PyTorch's basic operations so that I can keep up with the course.
Learning resources:
See torch.nn in the official PyTorch documentation.
nn.Module
A custom network subclasses nn.Module and overrides __init__ as well as forward():
import torch
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
    def forward(self, input):
        output = input + 1
        return output
tudui = Tudui()
x = torch.tensor(1.0)
output = tudui(x)  # calling the instance goes through nn.Module.__call__, which dispatches to forward()
print(output)
tensor(2.)
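Calling the instance like a function works because nn.Module implements __call__, which runs any registered hooks and then calls forward(). As a quick check, calling forward() directly gives the same result here, though going through the instance is the recommended way:
print(tudui.forward(x))  # tensor(2.), same as tudui(x), but bypasses the hooks that __call__ would run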
torch.nn.functional.conv2d()
The functional convolution F.conv2d() takes the stride and padding as arguments. Its signature is:
F.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) -> Tensor
import torch.nn.functional as F
import torchvision
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])
print(input.shape)
print(kernel.shape)
input = torch.reshape(input, (1, 1, 5, 5))    # F.conv2d() expects 4-D tensors: input as (minibatch, in_channels, H, W)
kernel = torch.reshape(kernel, (1, 1, 3, 3))  # and weight as (out_channels, in_channels/groups, kH, kW)
print(input.shape)
print(kernel.shape)
torch.Size([5, 5])
torch.Size([3, 3])
torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])
Convolve the input with the kernel, first with stride = 1:
output = F.conv2d(input, kernel, stride = 1)
print(output)
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
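As a check, the top-left output value is the sum of the elementwise product of the kernel with the top-left 3×3 window of the input: 1·1 + 2·2 + 0·1 + 0·0 + 1·1 + 2·0 + 1·2 + 2·1 + 1·0 = 10.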
output2 = F.conv2d(input, kernel, stride = 2)
print(output2)
tensor([[[[10, 12],
          [13,  3]]]])
With stride = 2 the kernel moves two positions at a time, so only a 2×2 output fits. padding pads the border of the input with zeros; with padding = 1 the 5×5 input produces a 5×5 output:
output3 = F.conv2d(input, kernel, stride = 1, padding = 1)
print(output3)
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])
Convolution Layers
torch.nn.Conv2d()
Official PyTorch documentation
Parameters:
in_channels (int) – number of channels in the input image (e.g. 3 for an RGB image)
out_channels (int) – number of channels produced by the convolution
kernel_size (int or tuple) – size of the convolution kernel
stride (int or tuple, optional) – stride of the convolution. Default: 1
padding (int, tuple or str, optional) – padding added to all four sides of the input. Default: 0
padding_mode (string, optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'
dilation (int or tuple, optional) – spacing between kernel elements. Default: 1
groups (int, optional) – number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional) – if True, adds a learnable bias to the output. Default: True
More graphical examples of these parameters can be found in the documentation.
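The output spatial size follows the formula in the Conv2d documentation: H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1). A minimal sketch of that formula as a helper (conv_out_size is a made-up name for illustration, not a PyTorch function):
def conv_out_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    # output size formula from the torch.nn.Conv2d documentation
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv_out_size(5, 3, stride=1))             # 3, matches the 3x3 output above
print(conv_out_size(5, 3, stride=2))             # 2, matches the 2x2 output
print(conv_out_size(5, 3, stride=1, padding=1))  # 5, matches the padded 5x5 output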
dataset = torchvision.datasets.CIFAR10('../dataset', train = False, transform = torchvision.transforms.ToTensor(), download = True)
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ../dataset\cifar-10-python.tar.gz
dataloader = DataLoader(dataset, batch_size = 64)
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.conv1 = Conv2d(in_channels = 3, out_channels = 6, kernel_size = 3, stride = 1, padding = 0)
        # self.conv1 = Conv2d(3, 6, 3)  # equivalent positional form
    def forward(self, x):
        x = self.conv1(x)
        return x
tudui = Tudui()
print(tudui)  # inspect the network structure
Tudui(
  (conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
)
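As a quick sanity check, the layer's learnable parameters can also be inspected directly; for Conv2d the weight tensor has shape (out_channels, in_channels, kH, kW) and the bias has shape (out_channels,):
print(tudui.conv1.weight.shape)  # torch.Size([6, 3, 3, 3])
print(tudui.conv1.bias.shape)    # torch.Size([6])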
Now look at the shape of the images after the convolution: the batch size is 64, the number of channels goes from 3 to 6, and the width and height each shrink by 2.
for data in dataloader:
    imgs, targets = data
    output = tudui(imgs)
    print(imgs.shape)
    print(output.shape)
    break
torch.Size([64, 3, 32, 32])
torch.Size([64, 6, 30, 30])
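This matches the output-size formula with kernel_size = 3, stride = 1, padding = 0: (32 + 2·0 - 3) / 1 + 1 = 30.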
To see this more intuitively, write the images to TensorBoard:
from torch.utils.tensorboard.writer import SummaryWriter
writer = SummaryWriter("logs")
step = 0
for data in dataloader:
    imgs, targets = data
    output = tudui(imgs)  # torch.Size([64, 6, 30, 30]); add_images cannot display a 6-channel image
    writer.add_images("input", imgs, step)
    output = torch.reshape(output, (-1, 3, 30, 30))  # force 3 channels so the images can be displayed; -1 is inferred from the other dimensions
    writer.add_images("output", output, step)
    step += 1
writer.close()
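Note that reshaping (64, 6, 30, 30) into (-1, 3, 30, 30) simply regroups the values into 128 three-channel images (64 × 6 = 128 × 3), so the "output" panel in TensorBoard shows twice as many images per step as the "input" panel; it is only a visualization trick, not a meaningful image transformation.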
Input:
Output: