CONV2D (official documentation link)
torch.nn.Conv2d(
in_channels,
out_channels,
kernel_size,
stride=1,
padding=0,
dilation=1,
groups=1,
bias=True,
padding_mode='zeros',
device=None,
dtype=None
)
For this module, the input has shape $(N, C_{in}, H, W)$ and the output has shape $(N, C_{out}, H_{out}, W_{out})$. They are related by:
$$\operatorname{out}\left(N_i, C_{\text{out}_j}\right)=\operatorname{bias}\left(C_{\text{out}_j}\right)+\sum_{k=0}^{C_{\text{in}}-1} \operatorname{weight}\left(C_{\text{out}_j}, k\right) \star \operatorname{input}\left(N_i, k\right)$$
where $N$ is the batch size, $C$ is the number of channels ($C_{in}$ for the input, $C_{out}$ for the output), $H$ is the image height, $W$ is the image width, and $\star$ is the 2D cross-correlation operator.
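To make the summation concrete, here is a minimal sketch that rebuilds the output of a small layer one term at a time: for each output channel it sums the single-channel cross-correlations over all input channels and then adds the bias. The layer sizes (3 input channels, 2 output channels, 8×8 input) and the variable names are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Small illustrative sizes (assumptions for this sketch).
conv = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=3)
x = torch.randn(4, 3, 8, 8)  # (N, C_in, H, W)

with torch.no_grad():
    reference = conv(x)  # (N, C_out, H_out, W_out)

    # out(N_i, C_out_j) = bias(C_out_j) + sum_k weight(C_out_j, k) ⋆ input(N_i, k)
    manual = torch.zeros_like(reference)
    for j in range(conv.out_channels):
        acc = torch.zeros_like(reference[:, j:j + 1])
        for k in range(conv.in_channels):
            kernel = conv.weight[j:j + 1, k:k + 1]  # shape (1, 1, 3, 3)
            plane = x[:, k:k + 1]                   # shape (N, 1, H, W)
            acc = acc + F.conv2d(plane, kernel)     # single-channel cross-correlation
        manual[:, j:j + 1] = acc + conv.bias[j]

    matches = torch.allclose(manual, reference, atol=1e-5)
```

If the reconstruction is right, `matches` should be `True` up to floating-point rounding, since the layer computes the same sum in a different order.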
The input may have shape $(N, C_{in}, H_{in}, W_{in})$ or $(C_{in}, H_{in}, W_{in})$.
The output may have shape $(N, C_{out}, H_{out}, W_{out})$ or $(C_{out}, H_{out}, W_{out})$.
The spatial sizes are related by:
$$H_{out}=\left\lfloor\frac{H_{in}+2 \times \text{padding}[0]-\text{dilation}[0] \times(\text{kernel\_size}[0]-1)-1}{\text{stride}[0]}+1\right\rfloor$$
$$W_{out}=\left\lfloor\frac{W_{in}+2 \times \text{padding}[1]-\text{dilation}[1] \times(\text{kernel\_size}[1]-1)-1}{\text{stride}[1]}+1\right\rfloor$$
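The two formulas can be checked by hand with a few lines of plain Python. The sketch below is not part of PyTorch; the helper names `conv2d_output_size` and `conv2d_output_shape` are hypothetical, and the sample values (a 50×100 input with a (3, 5) kernel, stride (2, 1), padding (4, 2), dilation (3, 1)) are picked only as a worked example.

```python
def conv2d_output_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension, following the formula above.

    Integer floor division implements the floor, because floor(a / s + 1)
    equals floor(a / s) + 1 for integer a and s.
    """
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv2d_output_shape(h_in, w_in, kernel_size, stride=(1, 1),
                        padding=(0, 0), dilation=(1, 1)):
    """Apply the one-dimensional formula to height (index 0) and width (index 1)."""
    h_out = conv2d_output_size(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = conv2d_output_size(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return h_out, w_out


# Height: (50 + 8 - 3*2 - 1) // 2 + 1 = 51 // 2 + 1 = 26
# Width:  (100 + 4 - 1*4 - 1) // 1 + 1 = 99 + 1 = 100
shape = conv2d_output_shape(50, 100, (3, 5), stride=(2, 1),
                            padding=(4, 2), dilation=(3, 1))
print(shape)  # (26, 100)
```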
import torch
import torch.nn as nn

# With square kernels and equal stride
m = nn.Conv2d(16, 33, 3, stride=2)
# non-square kernels and unequal stride and with padding
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
# non-square kernels and unequal stride and with padding and dilation
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
input = torch.randn(20, 16, 50, 100)
# Only the last definition of m is applied; output.shape is torch.Size([20, 33, 26, 100])
output = m(input)
Official documentation link
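⭐ Difference
Both torch.nn.Conv2d and torch.nn.functional.conv2d can be used to add a 2D convolution when building a model in PyTorch, but the former is a module class and the latter is a function, so they are used differently. torch.nn.Conv2d creates and stores its own weight and bias when it is instantiated, whereas torch.nn.functional.conv2d expects the caller to pass the weight (and optionally the bias) on every call.
⭐ Usage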
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
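As a rough illustration of how the two forms relate, the sketch below runs the same input through an `nn.Conv2d` layer and through `F.conv2d`, passing the layer's own `weight` and `bias` explicitly. The layer configuration and tensor sizes are assumptions chosen for the example, and the comparison assumes the default `padding_mode='zeros'`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(20, 16, 50, 100)  # (N, C_in, H_in, W_in)

# Module form: the layer creates and stores its own weight and bias.
layer = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))

with torch.no_grad():
    out_module = layer(x)

    # Functional form: the caller supplies weight and bias on every call.
    # The weight has shape (out_channels, in_channels // groups, kH, kW).
    out_functional = F.conv2d(
        x,
        layer.weight,
        layer.bias,
        stride=layer.stride,
        padding=layer.padding,
        dilation=layer.dilation,
        groups=layer.groups,
    )

same = torch.allclose(out_module, out_functional, atol=1e-6)
```

The module form is usually the more convenient choice inside a model, because its parameters are registered and picked up by the optimizer automatically. The functional form fits cases where the weights come from somewhere else, for example a fixed hand-written filter.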