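A model usually subclasses nn.Module. Its instances hold several ordered dictionaries:
_modules, _parameters, and _buffers, which store the module's direct submodules and the parameters and buffers registered directly on it.
Note that in _modules the keys are the submodule names and the values are the submodule instances, so you can descend step by step into the submodules of submodules, as in the following two expressions: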
model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._parameters.keys()#odict_keys(['weight', 'bias'])
model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._buffers.keys()#odict_keys(['weight_mask'])
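To make the three dictionaries concrete, here is a minimal sketch on a toy module. TinyBlock and its attribute names (qkv, scale, mask) are hypothetical and chosen only for illustration; they are not the model inspected in these notes.

```python
import torch
import torch.nn as nn


class TinyBlock(nn.Module):  # hypothetical toy module, for illustration only
    def __init__(self):
        super().__init__()
        self.qkv = nn.Linear(4, 12)                       # stored in _modules
        self.scale = nn.Parameter(torch.ones(1))          # stored in _parameters
        self.register_buffer("mask", torch.ones(12, 4))   # stored in _buffers


block = TinyBlock()

# Each dictionary only holds what is registered directly on this module.
module_names = list(block._modules.keys())
param_names = list(block._parameters.keys())
buffer_names = list(block._buffers.keys())

# Descend one level: the Linear layer keeps its own _parameters dictionary.
nested_param_names = list(block._modules["qkv"]._parameters.keys())

print(module_names, param_names, buffer_names, nested_param_names)
```

The underscore-prefixed attributes are internal; for everyday code the public methods covered in the rest of these notes are usually the safer choice.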
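Because these are dictionaries, the keys(), values(), and items() methods are all available.
For example, model._modules.items() is an iterable view of (name, module) pairs for the model's direct submodules.
Methods such as model.buffers() return a generator instead. To get data out of a generator, use a loop or next(), or convert it with list() or dict().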
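For what generators, iterators, and iterables are, see:
"一文看懂python的迭代器和可迭代对象" (Python iterators and iterables explained in one article) - Zhihu (zhihu.com)
"Python迭代器和生成器详解" (Python iterators and generators explained in detail) - Zhihu (zhihu.com)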
model._buffers  # OrderedDict(): no buffers are registered directly on the top-level model
model.buffers()  # a generator object
list(model.buffers())[0].size()  # torch.Size([2304, 768])
type(list(model.named_buffers())[0])  # tuple
list(model.named_buffers())[0][0]  # 'blocks.0.attn.head_mask'
dict(model.named_buffers()).keys()
dict(model.buffers())  # ValueError: dictionary update sequence element #0 has length 2304; 2 is required
len(list(model.buffers()))  # 12
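The ways of consuming a generator listed above can be tried on a small standard layer. This is only a sketch: nn.BatchNorm1d stands in for the larger model from these notes because it registers three buffers of its own.

```python
import types

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)  # registers running_mean, running_var, num_batches_tracked

gen = bn.buffers()
is_generator = isinstance(gen, types.GeneratorType)

first = next(gen)        # pull one item: running_mean
remaining = list(gen)    # drain the rest of the same generator

# named_buffers() yields (name, tensor) pairs, so dict() accepts it directly.
named = dict(bn.named_buffers())
buffer_names = list(named.keys())

print(is_generator, tuple(first.shape), len(remaining), buffer_names)
```

dict() fails on bn.buffers() for the same reason as in the notes: it yields bare tensors, not (key, value) pairs.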
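# modules(): always a recursive traversal
model.named_modules() / model.modules()
model.modules() recursively iterates over every sublayer of the model, including the sublayers of sublayers; the model itself is yielded first.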
def named_modules(self, memo: Optional[Set['Module']] = None, prefix: str = '', remove_duplicate: bool = True):
    r"""Returns an iterator over all modules in the network, yielding
    both the name of the module as well as the module itself.
    Args:
        memo: a memo to store the set of modules already added to the result
        prefix: a prefix that will be added to the name of the module
        remove_duplicate: whether to remove the duplicated module instances in the result
            or not
    Yields:
        (str, Module): Tuple of name and module
    Note:
        Duplicate modules are returned only once. In the following
        example, ``l`` will be returned only once.
    Example::
        >>> l = nn.Linear(2, 2)
        >>> net = nn.Sequential(l, l)
        >>> for idx, m in enumerate(net.named_modules()):
        ...     print(idx, '->', m)
        0 -> ('', Sequential(
          (0): Linear(in_features=2, out_features=2, bias=True)
          (1): Linear(in_features=2, out_features=2, bias=True)
        ))
        1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
    """
    if memo is None:
        memo = set()
    if self not in memo:
        if remove_duplicate:
            memo.add(self)
        yield prefix, self
        for name, module in self._modules.items():
            if module is None:
                continue
            submodule_prefix = prefix + ('.' if prefix else '') + name
            for m in module.named_modules(memo, submodule_prefix, remove_duplicate):
                yield m
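The former, named_modules(), additionally returns each module's name, which makes it easier to access, initialize, or modify parameters:
for name, layer in model.named_modules():
    if 'conv' in name:
        pass  # process layer here
# Without the names, isinstance() can do the same job
for layer in model.modules():
    if isinstance(layer, nn.Conv2d):
        pass  # process layer here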
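As a runnable version of the two loops above, the sketch below re-initializes every convolution in a small nested model. The model layout and the choice of Kaiming initialization are assumptions made for the example, not something these notes prescribe.

```python
import torch
import torch.nn as nn

# Toy model: the nested Sequential shows that the traversal reaches inner layers.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.Sequential(nn.Conv2d(8, 8, kernel_size=3), nn.ReLU()),
)

conv_names = []
for name, layer in model.named_modules():
    # In nn.Sequential the names are indices such as '0' or '2.0',
    # so a type check is more reliable here than matching 'conv' in the name.
    if isinstance(layer, nn.Conv2d):
        nn.init.kaiming_normal_(layer.weight)
        nn.init.zeros_(layer.bias)
        conv_names.append(name)

print(conv_names)
```

Matching on the name works well when layers are assigned as attributes with descriptive names; matching on the type works regardless of naming.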
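# children(): direct sublayers only
model.named_children() / model.children()
model.children() iterates only over the model's direct sublayers and does not descend into the sublayers of those sublayers.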
def named_children(self) -> Iterator[Tuple[str, 'Module']]:
    r"""Returns an iterator over immediate children modules, yielding both
    the name of the module as well as the module itself.
    Yields:
        (str, Module): Tuple containing a name and child module
    Example::
        >>> # xdoctest: +SKIP("undefined vars")
        >>> for name, module in model.named_children():
        >>>     if name in ['conv4', 'conv5']:
        >>>         print(module)
    """
    memo = set()
    for name, module in self._modules.items():
        if module is not None and module not in memo:
            memo.add(module)
            yield name, module
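The difference between the two traversals is easiest to see side by side. A minimal sketch on a toy nested nn.Sequential (the layout is arbitrary):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 4),
    nn.Sequential(nn.Linear(4, 4), nn.ReLU()),
)

# named_children(): direct sublayers only.
child_names = [name for name, _ in model.named_children()]

# named_modules(): the model itself (empty name) plus every nested sublayer.
module_names = [name for name, _ in model.named_modules()]

print(child_names)
print(module_names)
```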
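# parameters(): yields only the optimizable parameters; recurse=True by default, so the traversal is recursive
model.named_parameters() / model.parameters()
Recursively returns all of the model's parameters, including the ones you registered yourself.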
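A short sketch of how recurse changes the result. Net and its attributes are hypothetical; cls_token is registered by hand with nn.Parameter, while the other parameters come from the nn.Linear submodule.

```python
import torch
import torch.nn as nn


class Net(nn.Module):  # hypothetical toy module, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)
        self.cls_token = nn.Parameter(torch.zeros(1, 4))  # registered by hand


net = Net()

# recurse=True (the default): own parameters first, then those of submodules.
all_names = [name for name, _ in net.named_parameters()]

# recurse=False: only the parameters registered directly on this module.
own_names = [name for name, _ in net.named_parameters(recurse=False)]

# parameters() yields the tensors without names, e.g. for counting elements.
n_params = sum(p.numel() for p in net.parameters())

print(all_names, own_names, n_params)
```

The same generator is what is normally passed to an optimizer, for example torch.optim.SGD(net.parameters(), lr=0.1).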
# buffers(): yields only the non-optimizable tensors (buffers); recurse=True by default, so the traversal is recursive
model.named_buffers() / model.buffers()
#model._parameters.keys()#odict_keys(['cls_token', 'pos_embed'])
def buffers(self, recurse: bool = True) -> Iterator[Tensor]:
    for _, buf in self.named_buffers(recurse=recurse):
        yield buf
def named_buffers(self, prefix: str = '', recurse: bool = True, remove_duplicate: bool = True) -> Iterator[Tuple[str, Tensor]]:
    r"""Returns an iterator over module buffers, yielding both the
    name of the buffer as well as the buffer itself
    """
    gen = self._named_members(
        lambda module: module._buffers.items(),
        prefix=prefix, recurse=recurse, remove_duplicate=remove_duplicate)
    yield from gen
>>> # recurse=True is the default, so buffers of nested submodules are included
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
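The sketch below registers a buffer by hand and lists it next to the buffers of a BatchNorm layer. Masked and head_mask are hypothetical names used only for this example. Note that nested buffers carry a prefix such as bn.running_var, so an exact match on 'running_var' only succeeds when named_buffers() is called on the BatchNorm layer itself; the example uses endswith() instead.

```python
import torch
import torch.nn as nn


class Masked(nn.Module):  # hypothetical toy module, for illustration only
    def __init__(self):
        super().__init__()
        self.bn = nn.BatchNorm1d(3)
        self.register_buffer("head_mask", torch.ones(3))  # saved, but not trained


m = Masked()

buffer_names = [name for name, _ in m.named_buffers()]                   # recursive
own_buffer_names = [name for name, _ in m.named_buffers(recurse=False)]  # direct only
param_names = [name for name, _ in m.named_parameters()]                 # no buffers here

# Nested names are prefixed with the submodule path, so match on the suffix.
var_shapes = [tuple(buf.shape) for name, buf in m.named_buffers()
              if name.endswith("running_var")]

print(buffer_names, own_buffer_names, param_names, var_shapes)
```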
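# state_dict(): a dictionary that also includes buffers
model.state_dict()
model.state_dict() returns an ordered dictionary. Its keys are the names of the layers that hold learnable parameters, followed by the tensor name such as weight or bias, and its values are the corresponding weights or biases. Layers without parameters or buffers do not appear in it.
It contains both parameters and persistent buffers; buffers registered with persistent=False are left out.
Unlike the methods above, model.state_dict() needs no iteration because it is itself a dictionary. By default its tensors share storage with the model's parameters, so editing them in place changes the model directly, which is especially convenient for parameter pruning. Assigning a new tensor to a key only changes the dictionary until model.load_state_dict() is called. For more detail on state_dict, see: "PyTorch模型保存深入理解" (an in-depth look at saving PyTorch models), Ciao112's blog on CSDN.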
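The following sketch checks which keys end up in state_dict() and contrasts the two ways of changing a value through it. nn.BatchNorm1d is used only because it has both parameters and buffers; the values written are arbitrary.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(3)
sd = bn.state_dict()

# Parameters come first, then persistent buffers.
keys = list(sd.keys())

# In-place edit: the tensor in the dict shares storage with the live parameter,
# so the model changes immediately.
sd["weight"].zero_()
weight_is_zero = bool((bn.weight == 0).all())

# Rebinding a key only changes the dictionary ...
sd["bias"] = torch.full((3,), 2.0)
bias_before_load = bn.bias.detach().clone()

# ... until the dictionary is loaded back into the model.
bn.load_state_dict(sd)
bias_after_load = bn.bias.detach().clone()

print(keys, weight_is_zero, bias_before_load, bias_after_load)
```

For pruning, the in-place route (for example multiplying a weight by a 0/1 mask with mul_()) avoids the extra load_state_dict() call.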