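There are several ways to modify a model in PyTorch, but some of them fail with puzzling errors.
Three approaches are described below. For details, see the blog posts
"pytorch中的pre-train函数模型引用及修改(增减网络层,修改某层参数等)" (referencing and modifying pre-trained models in PyTorch: adding or removing layers, changing a layer's parameters, and so on)
and "(继)pytorch中的pretrain模型网络结构修改" (a follow-up on modifying the network structure of pretrained models in PyTorch).
The first approach is to modify the model directly after loading it (this works well for resnet, but not for vgg), for example: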
model.fc = nn.Linear(fc_features, 9)
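This works when the layer to be modified can be reached by name as an attribute of the model (self.<layer name>).
It is harder when the layer sits inside an nn.Sequential. In the PyTorch version I was using, the Sequential type defined only __getitem__ and not __setitem__, so a layer could be read by index but not replaced: sequential[0] = nn.Linear(fc_features, 9) raised an error. Newer PyTorch releases do implement __setitem__ on nn.Sequential, so there the assignment works.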
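As a minimal sketch of swapping a layer that lives inside an nn.Sequential, the snippet below shows two options: index assignment, which assumes a PyTorch version whose nn.Sequential defines __setitem__, and rebuilding the container from its children, which does not depend on that. The toy `classifier` and its layer sizes are made up for illustration and only stand in for something like a VGG classifier head.

```python
import torch
import torch.nn as nn

# Toy stand-in for a classifier head (hypothetical sizes, not a real pretrained model).
classifier = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(inplace=True),
    nn.Linear(16, 1000),
)

# Option A: index assignment. This needs a PyTorch version in which
# nn.Sequential implements __setitem__.
classifier[2] = nn.Linear(16, 9)

# Option B: rebuild the container, reusing every layer except the last one.
# The reused layers are the same module objects, so their weights are shared.
kept_layers = list(classifier.children())[:-1]
rebuilt = nn.Sequential(*kept_layers, nn.Linear(16, 9))

x = torch.randn(4, 8)
out_a = classifier(x)
out_b = rebuilt(x)
print(tuple(out_a.shape), tuple(out_b.shape))
```

Option B is the same idea as the children()-based approach: nothing is edited in place, a new container is assembled from the pieces worth keeping.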
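The second approach is parameter overriding: define a similar network yourself, then copy the parameters of the pretrained model into your own network. The blog posts above use a resnet pretrained model as the example.
I do not fully understand this method yet. I still do not see how to apply it to layers inside a Sequential, and it feels like it requires larger changes.
It uses state_dict() to look up each layer by name and assign its weights. Layers that should receive pretrained weights need the same names as in the pretrained network, while new or changed layers in your own network must be given names that do not appear in the pretrained network.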
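The name-matching idea can be sketched on two toy networks. Everything here is hypothetical: `Pretrained` merely stands in for a real pretrained model, and `MyNet` is a made-up variant whose new head is called `fc_new` so that its keys do not collide with the pretrained `fc`.

```python
import torch
import torch.nn as nn

class Pretrained(nn.Module):
    """Hypothetical stand-in for a pretrained network."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.fc = nn.Linear(4, 1000)

class MyNet(nn.Module):
    """Own network: same conv, but a new head under a new name."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)  # same name: weights get copied
        self.fc_new = nn.Linear(4, 9)                          # new name: keeps its own init

pretrained = Pretrained()
mynet = MyNet()

pretrained_dict = pretrained.state_dict()
model_dict = mynet.state_dict()

# Keep only the entries whose names also exist in the new network.
shared = {k: v for k, v in pretrained_dict.items() if k in model_dict}
model_dict.update(shared)
mynet.load_state_dict(model_dict)

print(sorted(shared))
```

Had the new head kept the name `fc`, its key would have matched a pretrained tensor of a different shape and `load_state_dict` would have raised a size-mismatch error.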
The third approach is to use nn.Module's model.children() to redefine the layers of your own model. This one is the most flexible.
self.layer= nn.Sequential(*list(model.children())[:-2])
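To make the slicing concrete, here is a small sketch under stated assumptions: `Backbone` is a made-up, ResNet-like toy model (not a torchvision model), and `FeatureNet` is a hypothetical wrapper that keeps everything except the last two children.

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    """Hypothetical ResNet-like toy model: conv -> relu -> avgpool -> fc."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 1000)

    def forward(self, x):
        x = self.avgpool(self.relu(self.conv(x)))
        return self.fc(torch.flatten(x, 1))

class FeatureNet(nn.Module):
    """Keeps all children of `model` except the last two (avgpool and fc)."""
    def __init__(self, model):
        super().__init__()
        self.layer = nn.Sequential(*list(model.children())[:-2])

    def forward(self, x):
        return self.layer(x)

net = FeatureNet(Backbone())
features = net(torch.randn(2, 3, 16, 16))
print(tuple(features.shape))  # (2, 8, 16, 16)
```

One caveat: children() only returns registered submodules in registration order. Operations that exist only inside forward(), such as the flatten call above, are not carried over and have to be re-added by hand if they are needed.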
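For example, to change vgg11 so that it takes 1-channel input and outputs 100 classes, the implementation below needs only a small amount of modified and added code: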
import torch.nn as nn
import torch.utils.model_zoo as model_zoo
import math


class VGG(nn.Module):

    def __init__(self, features, num_classes=1000, init_weights=True):
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        # Flatten to (batch, 512 * 7 * 7) before the classifier
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


def make_layers(cfg, batch_norm=False):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    return nn.Sequential(*layers)


# Build the feature layers for a model with 1-channel input
def make_one_channel_layers(cfg, batch_norm=False):
    layers = []
    in_channels = 1
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            if batch_norm:
                layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
            else:
                layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    return nn.Sequential(*layers)


cfg = {
    'A': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'B': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'D': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'E': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}
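

def vgg11(pretrained=False, **kwargs):
    """VGG 11-layer model (configuration "A")

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    if pretrained:
        kwargs['init_weights'] = False
    model = VGG(make_layers(cfg['A']), **kwargs)
    # Our own model: 1-channel input, 100 output classes
    mymodel = VGG(make_one_channel_layers(cfg['A']), num_classes=100, **kwargs)
    if pretrained:
        # model_urls is not defined in this snippet; it is assumed to be the
        # URL dict from torchvision's vgg.py
        model.load_state_dict(model_zoo.load_url(model_urls['vgg11']))
        # Pick the wanted parts of the pretrained model and splice in the parts
        # defined by mymodel: its 1-channel first conv and its 100-class last Linear.
        # The result is assigned to mymodel, because mymodel is what gets returned.
        mymodel.features = nn.Sequential(list(mymodel.features.children())[0], *list(model.features.children())[1:])
        mymodel.classifier = nn.Sequential(*list(model.classifier.children())[:-1], list(mymodel.classifier.children())[-1])
    return mymodel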
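Hmm... The first convolution can also be replaced inline. Its out_channels, kernel_size and stride have to match what the following layers of the particular model expect (for the VGG11 above that would be 64 output channels with kernel_size=3, padding=1):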
model.features=nn.Sequential(nn.Conv2d(1, 96, kernel_size=7, stride=2),*list(model.features.children())[1:])
Another thing I tried was setting the attribute directly:
model_conv.classifier[6].out_features = Output_features
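With this kind of direct edit, the printed model shows the modified value, but running it still fails with an error like `Given groups=1, weight[64, 3, 3, 3], so expected input[32, 1, 224, 224] to have 3 channels, but got 1 channels`. The attribute is just a stored number: changing it does not resize the layer's weight tensor, so the layer still behaves as before. (The error quoted here comes from the first conv layer, whose weight still expects 3 input channels.) It would be very convenient if this worked, but it does not, so the layer itself has to be replaced.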
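The following sketch illustrates why editing the attribute has no real effect, using a standalone nn.Linear with the same sizes as the last VGG classifier layer. The variable names are made up for illustration; the same reasoning applies to `in_channels` on a conv layer.

```python
import torch
import torch.nn as nn

fc = nn.Linear(4096, 1000)

# Editing the attribute only changes the stored number (and the printed repr).
fc.out_features = 100
shape_after_attr_edit = tuple(fc.weight.shape)       # weight is still (1000, 4096)
out_after_attr_edit = fc(torch.randn(2, 4096))       # still produces 1000 outputs

# Replacing the module creates a weight tensor of the new size.
fc = nn.Linear(fc.in_features, 100)
out_after_replace = fc(torch.randn(2, 4096))

print(shape_after_attr_edit, tuple(out_after_attr_edit.shape), tuple(out_after_replace.shape))
```

The forward pass is computed from `weight` and `bias`, so only a new module (or new parameter tensors) changes the output size.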
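Finally, here is a way to update only part of a model's parameters (for the case where the new model has some additional layers):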
pretrained_dict = ...
model_dict = model.state_dict()
# 1. filter out unnecessary keys
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
# 2. overwrite entries in the existing state dict
model_dict.update(pretrained_dict)
# 3. load the new state dict
model.load_state_dict(model_dict)
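A variant of the same recipe, sketched under assumptions: filtering by name alone still fails when a layer keeps its name but changes shape (for example a final Linear with a different number of classes), so this version also compares tensor shapes and then loads with `strict=False`. The `make_net` helper and both toy networks are hypothetical stand-ins; the "pretrained" weights here are just a freshly initialized network.

```python
import torch
import torch.nn as nn

def make_net(num_classes):
    """Hypothetical toy network; only the last layer depends on num_classes."""
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, num_classes))

# Stand-in for real pretrained weights (assumption: same layer names, 1000 classes).
pretrained_dict = make_net(1000).state_dict()

model = make_net(100)
model_dict = model.state_dict()

# Keep entries that exist in the new model AND have the same shape.
compatible = {
    k: v for k, v in pretrained_dict.items()
    if k in model_dict and v.shape == model_dict[k].shape
}

# strict=False tolerates the keys we left out and reports them back.
result = model.load_state_dict(compatible, strict=False)
print("loaded:", sorted(compatible))
print("left at their own init:", sorted(result.missing_keys))
```

Checking `missing_keys` afterwards is a cheap way to confirm that only the intended layers were skipped.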