model.named_modules() returns not only all of the model's submodules but also each submodule's name. "All submodules" here means the traversal is recursive: every module that can be reached by iterating into the model, including the model itself, is visited once.
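For reference, a model definition consistent with the transcript below might look like this (a minimal sketch reconstructed from the printed output; the roughly 40×40 input size, chosen so the flattened feature map has 9 × 8 × 8 = 576 elements, is an assumption):

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=3),
            nn.BatchNorm2d(6),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 9, kernel_size=3),
            nn.BatchNorm2d(9),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(576, 128),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 576)
        return self.classifier(x)

model = Net()
model_named_modules = list(model.named_modules())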
In [28]: len(model_named_modules)
Out[28]: 15
In [29]: model_named_modules
Out[29]:
[('', Net(
    (features): Sequential(
      (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
      (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace)
      (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
      (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): ReLU(inplace)
      (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (classifier): Sequential(
      (0): Linear(in_features=576, out_features=128, bias=True)
      (1): ReLU(inplace)
      (2): Dropout(p=0.5)
      (3): Linear(in_features=128, out_features=10, bias=True)
    )
  )),
 ('features', Sequential(
    (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
    (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): ReLU(inplace)
    (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )),
 ('features.0', Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))),
 ('features.1', BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)),
 ('features.2', ReLU(inplace)),
 ('features.3', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)),
 ('features.4', Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))),
 ('features.5', BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)),
 ('features.6', ReLU(inplace)),
 ('features.7', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)),
 ('classifier', Sequential(
    (0): Linear(in_features=576, out_features=128, bias=True)
    (1): ReLU(inplace)
    (2): Dropout(p=0.5)
    (3): Linear(in_features=128, out_features=10, bias=True)
  )),
 ('classifier.0', Linear(in_features=576, out_features=128, bias=True)),
 ('classifier.1', ReLU(inplace)),
 ('classifier.2', Dropout(p=0.5)),
 ('classifier.3', Linear(in_features=128, out_features=10, bias=True))]
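Because named_modules() gives you the dotted path of every module, it is convenient for picking out layers by name or by type. A small sketch:

# Collect the full name of every convolution layer in the model.
conv_names = [name for name, m in model.named_modules()
              if isinstance(m, nn.Conv2d)]
print(conv_names)   # ['features.0', 'features.4']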
model.modules() iterates over all of the model's submodules, where "submodule" means any nn.Module instance in the model tree. These are exactly the modules shown in the previous section, just without the names.
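Since modules() yields the modules themselves without names, a typical use is applying the same operation to every layer of a given type, for example weight initialization (a sketch):

for m in model.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight)   # re-initialize every conv layer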
model.children() iterates only over the model's outermost layer, i.e. its direct child modules (below, model_children = list(model.children())).
In [22]: len(model_children)
Out[22]: 2
In [23]: model_children
Out[23]:
[Sequential(
   (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
   (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (2): ReLU(inplace)
   (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
   (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
   (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (6): ReLU(inplace)
   (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
 ),
 Sequential(
   (0): Linear(in_features=576, out_features=128, bias=True)
   (1): ReLU(inplace)
   (2): Dropout(p=0.5)
   (3): Linear(in_features=128, out_features=10, bias=True)
 )]
model.named_children() is the named version of children(): it likewise iterates only over the direct child modules, but returns each one together with its name.
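A quick check (a sketch):

for name, child in model.named_children():
    print(name, '->', type(child).__name__)
# features -> Sequential
# classifier -> Sequential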
model.parameters() iteratively returns all of the model's parameters. During training we usually pass model.parameters() to the optimizer to specify which parameters should be learned. It is a generator whose items are the individual parameter Tensors; the optimizer only needs the raw parameters themselves, not their names, which is why model.parameters() is the method to use here.
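The usual training setup looks like this (the learning rate and momentum values are illustrative):

import torch.optim as optim

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)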
If you have been following along from the previous sections, you will recognize model.named_parameters(): it iteratively returns the parameters together with their names, where each name ends in .weight or .bias so that weights and biases can be told apart.
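The names make it easy to treat parts of the model differently, for example freezing the feature extractor while training only the classifier (a sketch):

# Parameter names are dotted paths ending in .weight or .bias,
# e.g. 'features.0.weight', 'classifier.3.bias'.
for name, param in model.named_parameters():
    if name.startswith('features'):
        param.requires_grad = False   # freeze the feature extractor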
model.state_dict() returns the model's dictionary directly. Unlike the previous methods, there is nothing to iterate: it is itself a dictionary, so you can change a layer's parameters simply by editing the state_dict, which makes it especially convenient for parameter pruning. In PyTorch, a state_dict is a dictionary object that maps each parameter name to its parameter Tensor.
state_dict() returns an OrderedDict. Its keys are the names of the layers that have learnable parameters, suffixed with .weight or .bias, and its values are the corresponding weight or bias Tensors; layers without parameters do not appear. (Non-learnable buffers, such as BatchNorm's running_mean and running_var, are stored in the state_dict as well.)
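A small sketch of inspecting and editing the state_dict (the pruning step just zeroes small weights for illustration):

sd = model.state_dict()          # OrderedDict: name -> Tensor
print(list(sd.keys())[:4])
# ['features.0.weight', 'features.0.bias', 'features.1.weight', 'features.1.bias']

# Crude magnitude pruning: zero out small weights in the first linear layer,
# then load the edited dict back into the model.
w = sd['classifier.0.weight']
w[w.abs() < 1e-2] = 0.0
model.load_state_dict(sd)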