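Hand-Written CPU Convolution Kernels to Accelerate Neural Network Computation (1): Naive Implementations of Convolution, Pooling, Activation, Fully Connected, and Batch Normalization (Python Implementation)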
1 Conv2d

def conv2d(input_numpy, kernel_weight_numpy, kernel_bias_numpy, padding=0):
    B, Ci, Hi, Wi = input_numpy.shape
    input_pad_numpy = torch.zeros(B, Ci, Hi+2*padding, Wi+2*padding)
    if padding > 0:
        input_pad_numpy[:, :, pad
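
The listing above breaks off partway through the padding step. To show the overall shape of a naive convolution, here is a minimal self-contained sketch. It is not the author's original code. It assumes stride 1, no dilation, a weight tensor laid out as (Co, Ci, Kh, Kw), and a bias of shape (Co,), and it uses plain NumPy arrays rather than `torch.zeros`. The name `conv2d_naive` and the local variable names are hypothetical, chosen only for this illustration.

```python
import numpy as np


def conv2d_naive(input_numpy, kernel_weight_numpy, kernel_bias_numpy, padding=0):
    """Naive 2D convolution (cross-correlation, as in deep learning frameworks).

    Assumptions for this sketch: stride 1, no dilation, zero padding,
    input shape (B, Ci, Hi, Wi), weight shape (Co, Ci, Kh, Kw), bias shape (Co,).
    """
    B, Ci, Hi, Wi = input_numpy.shape
    Co, Ci_k, Kh, Kw = kernel_weight_numpy.shape
    assert Ci == Ci_k, "input and kernel channel counts must match"

    # Zero-pad the two spatial dimensions, then copy the input into the centre.
    Hp, Wp = Hi + 2 * padding, Wi + 2 * padding
    input_pad = np.zeros((B, Ci, Hp, Wp), dtype=np.float64)
    input_pad[:, :, padding:padding + Hi, padding:padding + Wi] = input_numpy

    # Output size for stride 1.
    Ho, Wo = Hp - Kh + 1, Wp - Kw + 1
    output = np.zeros((B, Co, Ho, Wo), dtype=np.float64)

    # One output element per (batch, output channel, row, column):
    # multiply the kernel with the matching input window and sum everything.
    for b in range(B):
        for co in range(Co):
            for ho in range(Ho):
                for wo in range(Wo):
                    window = input_pad[b, :, ho:ho + Kh, wo:wo + Kw]
                    output[b, co, ho, wo] = (
                        np.sum(window * kernel_weight_numpy[co])
                        + kernel_bias_numpy[co]
                    )
    return output
```

With a 3x3 kernel and `padding=1`, the output keeps the spatial size of the input, so an input of shape (1, 1, 4, 4) gives an output of shape (1, 1, 4, 4). The four nested loops are deliberately unoptimized: they make the arithmetic explicit and serve as a reference against which a faster kernel can be checked.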