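I wanted to try replacing the final fully connected layer of resnet18 with a convolutional layer, to see whether it affects the network's accuracy or size.
1. First, the change to train.py:
The full train.py is in the earlier post "Gender detection with PyTorch" (pytorch实现性别检测).
# model_conv.fc = nn.Linear(fc_features, 2)  # the previous version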
model_conv.fc = nn.Conv2d(fc_features, 2, 1)
print(model_conv.fc)
But running it raised errors:
1)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [2, 512, 1, 1], but got 2-dimensional input of size [4, 512] instead
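[2, 512, 1, 1] is the shape of the convolution weight, [out_channels, in_channels, kernel_height, kernel_width]. The layer expects a 4-dimensional input [batch_size, channels, height, width], here [4, 512, 1, 1], but it received the flattened [4, 512], i.e. [batch_size, features].
This is because the tensor is flattened before it is passed to the fc layer: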
x = x.view(x.size(0), -1)
Go to that line in /anaconda3/envs/deeplearning/lib/python3.6/site-packages/torchvision/models/resnet.py and comment it out.
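The shape mismatch can be reproduced in isolation. The sketch below assumes a batch of 4 and uses a random tensor (the name `pooled` is made up for illustration) in place of the real avgpool output of resnet18, which has shape [4, 512, 1, 1]:

```python
import torch
import torch.nn as nn

# Stand-in for the output of resnet18's avgpool for a batch of 4 images
# (random values; only the shape matters here).
pooled = torch.randn(4, 512, 1, 1)

fc = nn.Conv2d(512, 2, 1)  # the 1x1 convolution that replaces nn.Linear

# Flattening first, as the original forward() does, drops the two spatial
# dimensions that Conv2d needs.
flat = pooled.view(pooled.size(0), -1)
print(flat.shape)  # torch.Size([4, 512])

# Feeding the unflattened tensor works and keeps the 1x1 spatial dimensions.
out = fc(pooled)
print(out.shape)  # torch.Size([4, 2, 1, 1])
```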
2)
Traceback (most recent call last):
File "train.py", line 192, in <module>
model_train = train_model(model_conv, criterion, optimizer_conv, exp_lr_scheduler)
File "train.py", line 135, in train_model
loss = criterion(outputs, labels)
File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 904, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/functional.py", line 1970, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/nn/functional.py", line 1792, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /Users/soumith/b101_2/2019_02_08/wheel_build_dirs/wheel_3.6/pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:59
First, print the outputs and the labels:
tensor([[[[-0.8409]],
[[ 0.3311]]],
[[[-0.3910]],
[[ 0.6904]]],
[[[-0.4417]],
[[ 0.3846]]],
[[[-1.1002]],
[[ 0.6044]]]], grad_fn=<ThnnConv2DBackward>) torch.Size([4, 2, 1, 1])
tensor([1, 1, 0, 0]) torch.Size([4])
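The output does not have the shape we want: the trailing dimensions channels × height × width = 2 × 1 × 1 need to be collapsed into 2, giving a result of shape [4, 2].
Later on, the training code runs: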
_, preds = torch.max(outputs, 1)
This returns the index of the larger of the two values, so the shape of the result becomes [4].
The fix is to add one line right after the fc layer in resnet.py:
x = x.view(x.size(0), -1)
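As a quick check of this reshape, the sketch below rebuilds the printed outputs and labels by hand (values copied from the printout above, without the grad_fn). It assumes the criterion is nn.CrossEntropyLoss, which is what the traceback suggests:

```python
import torch
import torch.nn as nn

# Values copied from the printed tensors: outputs is [4, 2, 1, 1], labels is [4].
outputs = torch.tensor([[[[-0.8409]], [[0.3311]]],
                        [[[-0.3910]], [[0.6904]]],
                        [[[-0.4417]], [[0.3846]]],
                        [[[-1.1002]], [[0.6044]]]])
labels = torch.tensor([1, 1, 0, 0])

# Collapse channels x height x width (2 x 1 x 1) into a single dimension of 2.
outputs = outputs.view(outputs.size(0), -1)
print(outputs.shape)  # torch.Size([4, 2])

# With [batch_size, num_classes] logits and [batch_size] targets the loss works.
criterion = nn.CrossEntropyLoss()
loss = criterion(outputs, labels)

# Index of the larger logit in each row: one predicted class per sample.
_, preds = torch.max(outputs, 1)
print(preds)  # tensor([1, 1, 1, 1])
```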
With both changes, the forward() function of the resnet network becomes:
def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)
    x = self.avgpool(x)
    #x = x.view(x.size(0), -1)
    x = self.fc(x)
    x = x.view(x.size(0), -1)
    return x
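Editing resnet.py inside site-packages changes torchvision for every project in that environment. As an alternative that is not from the original post, the sketch below leaves torchvision's forward() untouched (so it still flattens before fc) and wraps the convolution in a small head that restores the 1x1 spatial dimensions and flattens again afterwards. It assumes a PyTorch version that provides nn.Flatten and nn.Unflatten, which is newer than the one in the traceback; the helper name make_conv_head is hypothetical:

```python
import torch
import torch.nn as nn


def make_conv_head(in_features, num_classes):
    """Build a drop-in replacement for model.fc that uses a 1x1 convolution."""
    return nn.Sequential(
        nn.Unflatten(1, (in_features, 1, 1)),    # [N, C] -> [N, C, 1, 1]
        nn.Conv2d(in_features, num_classes, 1),  # [N, C, 1, 1] -> [N, classes, 1, 1]
        nn.Flatten(),                            # [N, classes, 1, 1] -> [N, classes]
    )


head = make_conv_head(512, 2)

# Stand-in for the flattened features that torchvision's forward() passes to fc.
features = torch.randn(4, 512)
print(head(features).shape)  # torch.Size([4, 2])
```

In train.py this would be assigned as model_conv.fc = make_conv_head(fc_features, 2), with resnet.py left as shipped.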
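2. After that, training runs without errors. In my case the results were not noticeably different, and the size of the trained network was about the same.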
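The similar size is expected: a 1x1 convolution applied to a 1x1 feature map computes the same thing as a fully connected layer, with the same number of parameters. A minimal sketch, assuming 512 input features and 2 classes as in the error message above (count_params is a helper written for this illustration):

```python
import torch
import torch.nn as nn

linear = nn.Linear(512, 2)
conv = nn.Conv2d(512, 2, 1)


def count_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# Both layers hold 512 * 2 weights plus 2 biases.
print(count_params(linear), count_params(conv))  # 1026 1026

# Copy the linear weights into the convolution to compare the outputs.
with torch.no_grad():
    conv.weight.copy_(linear.weight.view(2, 512, 1, 1))
    conv.bias.copy_(linear.bias)

x = torch.randn(4, 512)
out_linear = linear(x)
out_conv = conv(x.view(4, 512, 1, 1)).view(4, -1)

# The two outputs should agree up to floating-point rounding.
print(torch.allclose(out_linear, out_conv, atol=1e-4))
```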