(1) Convolution operation: a convolution kernel slides over the input signal (image), performing an element-wise multiply-accumulate at each position (a minimal sketch follows this list).
(2) Convolution kernel: also called a filter; it can be thought of as encoding some pattern or feature.
(3) Convolution works like scanning the image with a template: the more a region resembles the kernel's pattern, the higher the activation, and this is how feature extraction happens.
(4) Visualizing the kernels of AlexNet shows that they learn low-level patterns such as edges, stripes, and colors.
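To make (1) concrete, here is a minimal sketch of the sliding multiply-accumulate in plain PyTorch. The helper name cross_correlate2d is ours, purely for illustration; strictly speaking, the unflipped multiply-accumulate that deep-learning frameworks compute is cross-correlation rather than mathematical convolution.

import torch

def cross_correlate2d(x, k):
    """Slide kernel k over x (stride 1, no padding); multiply-accumulate at each position."""
    H, W = x.shape
    h, w = k.shape
    out = torch.zeros(H - h + 1, W - w + 1)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + h, j:j + w] * k).sum()  # element-wise product, then sum
    return out

x = torch.arange(16.).reshape(4, 4)
k = torch.ones(3, 3)
print(cross_correlate2d(x, k))  # 2x2 output: each entry sums a 3x3 window of x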
Convolution dimensionality: in general, a convolution is N-dimensional when its kernel slides over N dimensions of the input.
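A minimal shape demo of this rule (assuming batch size 1 and 3 input channels): the kernel of nn.Conv1d slides over one dimension (L), nn.Conv2d over two (H, W), and nn.Conv3d over three (D, H, W).

import torch
import torch.nn as nn

x1 = torch.randn(1, 3, 32)           # (N, C, L)
x2 = torch.randn(1, 3, 32, 32)       # (N, C, H, W)
x3 = torch.randn(1, 3, 8, 32, 32)    # (N, C, D, H, W)
print(nn.Conv1d(3, 1, 3)(x1).shape)  # torch.Size([1, 1, 30])
print(nn.Conv2d(3, 1, 3)(x2).shape)  # torch.Size([1, 1, 30, 30])
print(nn.Conv3d(3, 1, 3)(x3).shape)  # torch.Size([1, 1, 6, 30, 30])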
1D convolution
2D convolution:
https://mlnotebook.github.io/post/CNN1/
Function: performs a 2D convolution over multiple 2D signals (channels), e.g. the three channels of an RGB image.
import os
import torch.nn as nn
from PIL import Image
from torchvision import transforms
from matplotlib import pyplot as plt

path_tools = os.path.join("tools", "common_tools.py")  # location of the helper module imported below
# print(path_tools)
from tools.common_tools import transform_invert, set_seed
set_seed(3)  # make the random kernel initialization reproducible
# Load the image
img = Image.open('lena.png').convert('RGB')  # pixel values in 0~255
plt.imshow(img)

# Convert to a tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)

# Add a batch dimension at dim=0, because Conv2d expects a 4-D input
# of shape (N, C, H, W)
img_tensor.unsqueeze_(dim=0)  # C*H*W -> B*C*H*W
# Create the convolution layer
flag = 1
# flag = 0
if flag:
    conv_layer = nn.Conv2d(3, 1, 3)  # args: (in_channels, out_channels, kernel_size); weight shape: (o, i, h, w)
    # Initialize the parameters
    nn.init.xavier_normal_(conv_layer.weight.data)
    img_conv = conv_layer(img_tensor)

    # Visualize
    print("Shape before conv: {}\nShape after conv: {}".format(img_tensor.shape, img_conv.shape))
    img_conv = transform_invert(img_conv[0, 0:1, ...], img_transform)
    img_raw = transform_invert(img_tensor.squeeze(), img_transform)
    plt.subplot(122).imshow(img_conv, cmap='gray')
    plt.subplot(121).imshow(img_raw)
    plt.show()
Shape before conv: torch.Size([1, 3, 512, 512])
Shape after conv: torch.Size([1, 1, 510, 510])
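The spatial shrink follows the standard Conv2d output-size formula (with dilation=1): out = floor((in + 2*padding - kernel_size) / stride) + 1. Here padding=0, stride=1, kernel_size=3, so out = (512 - 3) / 1 + 1 = 510; the channel count drops from 3 to 1 because the layer has a single output kernel.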
Transposed convolution, also known as deconvolution (Deconvolution) and fractionally-strided convolution (Fractionally-strided Convolution), is used to upsample (UpSample) images. Note that "deconvolution" is a misnomer: it does not undo a convolution, it only reverses the shape transformation.
Reference: the Zhihu thread "怎样通俗理解反卷积?" ("How to intuitively understand deconvolution?")
flag = 1
if flag:
    # Use stride=2 to upsample by roughly a factor of 2
    conv_layer = nn.ConvTranspose2d(3, 1, 3, stride=2)  # args: (in_channels, out_channels, kernel_size)
    # Initialize the parameters
    nn.init.xavier_normal_(conv_layer.weight.data)
    img_conv = conv_layer(img_tensor)

    # Visualize
    print("Shape before conv: {}\nShape after conv: {}".format(img_tensor.shape, img_conv.shape))
    img_conv = transform_invert(img_conv[0, 0:1, ...], img_transform)
    img_raw = transform_invert(img_tensor.squeeze(), img_transform)
    plt.subplot(122).imshow(img_conv, cmap='gray')
    plt.subplot(121).imshow(img_raw)
    plt.show()
Shape before conv: torch.Size([1, 3, 512, 512])
Shape after conv: torch.Size([1, 1, 1025, 1025])
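This matches the transposed-convolution output-size formula (with dilation=1 and output_padding=0): out = (in - 1) * stride - 2 * padding + kernel_size = (512 - 1) * 2 - 0 + 3 = 1025, i.e. roughly a 2x upsampling.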
# help(nn.Conv2d)
help(nn.init.xavier_normal_)
Help on function xavier_normal_ in module torch.nn.init:

xavier_normal_(tensor, gain=1.0)
    Fills the input `Tensor` with values according to the method
    described in `Understanding the difficulty of training deep feedforward
    neural networks` - Glorot, X. & Bengio, Y. (2010), using a normal
    distribution. The resulting tensor will have values sampled from
    :math:`\mathcal{N}(0, \text{std}^2)` where

    .. math::
        \text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        gain: an optional scaling factor

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.xavier_normal_(w)
import torch

w = torch.empty(3, 5)      # uninitialized memory: the first printed values are arbitrary
print(w)
nn.init.xavier_normal_(w)  # fill in place; std = gain * sqrt(2 / (fan_in + fan_out)) = sqrt(2/8) = 0.5
print(w)
tensor([[0.1107, 0.1588, 0.7464, 0.8560, 0.2025],
[0.1505, 0.2469, 0.3596, 0.3466, 0.1254],
[0.7501, 0.0495, 0.5417, 0.2369, 0.4551]])
tensor([[ 1.1172, 1.2298, 1.2443, -0.0800, 0.0806],
[ 0.4042, -0.5826, 0.1932, -0.6775, 0.2642],
[-0.5724, 0.0882, -0.0440, -0.5700, 0.4701]])
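As a sanity check on the formula in the docstring, the sample standard deviation of a large Xavier-initialized tensor should come out close to gain * sqrt(2 / (fan_in + fan_out)); a minimal sketch:

import math
import torch
import torch.nn as nn

w = torch.empty(512, 256)                    # fan_out=512, fan_in=256
nn.init.xavier_normal_(w)                    # default gain=1.0
expected_std = math.sqrt(2.0 / (512 + 256))  # ~0.051
print(w.std().item(), expected_std)          # the two values should be close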
The convolution illustrations above come from: https://mlnotebook.github.io/post/CNN1/