A complete guide to PyTorch's model-inspection functions: the model.modules() family

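A model usually subclasses nn.Module, and every instance carries several ordered dictionaries:

_modules, _parameters, and _buffers, which hold the model's direct submodules and the parameters and buffers registered directly on it.

Note that in _modules each key is a submodule's name and each value is the submodule instance, so lookups can be chained to reach a submodule's own submodules, as in the two expressions below.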

model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._parameters.keys()#odict_keys(['weight', 'bias'])

model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._buffers.keys()#odict_keys(['weight_mask'])

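Because these are dictionaries, the usual keys(), values(), and items() methods are available.

For example, model._modules.items() gives the (name, module) pairs of the model's direct submodules; nested submodules are reached through each child's own _modules.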
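
As a minimal sketch of the three dictionaries, assuming a hypothetical toy module TinyNet (not the model used in this post) with one submodule, one parameter, and one buffer:

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical toy model, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)                     # stored in _modules
        self.scale = nn.Parameter(torch.ones(1))      # stored in _parameters
        self.register_buffer("step", torch.zeros(1))  # stored in _buffers


net = TinyNet()
modules_keys = list(net._modules.keys())      # ['fc']
params_keys = list(net._parameters.keys())    # ['scale']
buffers_keys = list(net._buffers.keys())      # ['step']

# Chained lookup: the parameters owned by the 'fc' submodule itself.
fc_param_keys = list(net._modules["fc"]._parameters.keys())  # ['weight', 'bias']
print(modules_keys, params_keys, buffers_keys, fc_param_keys)
```

Each dictionary only covers what is registered directly on that module, which is why fc's weight and bias do not show up in net._parameters.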

 

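Next, let's look at several methods of model.

Most of them return generators, so you read the data with a loop or next(), or convert the result with list() or dict().

For what generators, iterators, and iterables are, see:

一文看懂python的迭代器和可迭代对象 (Python iterators and iterables explained) - Zhihu (zhihu.com)

Python迭代器和生成器详解 (Python iterators and generators in detail) - Zhihu (zhihu.com)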
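
A short sketch of the three ways to consume such a generator, assuming a small nn.Sequential stand-in rather than the model used in this post:

```python
import torch.nn as nn

# Stand-in model for illustration only.
net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

gen = net.named_parameters()
first_name, first_param = next(gen)     # pull a single (name, tensor) pair
remaining = [name for name, _ in gen]   # a loop consumes the rest

# list() and dict() need a fresh generator, since gen is now exhausted.
as_list = list(net.parameters())
as_dict = dict(net.named_parameters())
print(first_name, remaining, len(as_list), list(as_dict.keys()))
```

dict() only works on the named_* variants, because it needs (key, value) pairs.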

 

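# modules(): always traverses recursively

model.named_modules() / model.modules()

model.modules() walks the whole module tree: the model itself, its sublayers, and the sublayers of those sublayers.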

    def named_modules(self, memo: Optional[Set['Module']] = None, prefix: str = '', remove_duplicate: bool = True):
        r"""Returns an iterator over all modules in the network, yielding
        both the name of the module as well as the module itself.

        Args:
            memo: a memo to store the set of modules already added to the result
            prefix: a prefix that will be added to the name of the module
            remove_duplicate: whether to remove the duplicated module instances in the result
                or not

        Yields:
            (str, Module): Tuple of name and module

        Note:
            Duplicate modules are returned only once. In the following
            example, ``l`` will be returned only once.

        Example::

            >>> l = nn.Linear(2, 2)
            >>> net = nn.Sequential(l, l)
            >>> for idx, m in enumerate(net.named_modules()):
            ...     print(idx, '->', m)

            0 -> ('', Sequential(
              (0): Linear(in_features=2, out_features=2, bias=True)
              (1): Linear(in_features=2, out_features=2, bias=True)
            ))
            1 -> ('0', Linear(in_features=2, out_features=2, bias=True))

        """

        if memo is None:
            memo = set()
        if self not in memo:
            if remove_duplicate:
                memo.add(self)
            yield prefix, self
            for name, module in self._modules.items():
                if module is None:
                    continue
                submodule_prefix = prefix + ('.' if prefix else '') + name
                for m in module.named_modules(memo, submodule_prefix, remove_duplicate):
                    yield m

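named_modules() additionally yields each module's name, which makes it easier to locate, initialize, or modify specific layers.

for name, layer in model.named_modules():
    if 'conv' in name:
        pass  # process layer here

# Without names, isinstance() achieves the same thing
for layer in model.modules():
    if isinstance(layer, nn.Conv2d):
        pass  # process layer here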
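
To make the two loops concrete, here is a runnable sketch on an assumed nested nn.Sequential. Zeroing the bias is only a placeholder for whatever processing you need.

```python
import torch
import torch.nn as nn

# Assumed toy network with one level of nesting.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.Sequential(nn.Conv2d(8, 8, kernel_size=3), nn.ReLU()),
)

# named_modules() yields the root (empty name), then every nested module.
module_names = [name for name, _ in net.named_modules()]

conv_names = []
for name, layer in net.named_modules():
    if isinstance(layer, nn.Conv2d):
        nn.init.zeros_(layer.bias)  # placeholder processing step
        conv_names.append(name)

print(module_names)  # ['', '0', '1', '2', '2.0', '2.1']
print(conv_names)    # ['0', '2.0']
```

Filtering by type is usually more robust than filtering by name, since auto-generated names such as '2.0' say nothing about the layer kind.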

# children(): direct sublayers only

model.named_children()/model.children()

model.children() iterates only over the model's direct sublayers; it does not descend into their sublayers.

    def named_children(self) -> Iterator[Tuple[str, 'Module']]:
        r"""Returns an iterator over immediate children modules, yielding both
        the name of the module as well as the module itself.

        Yields:
            (str, Module): Tuple containing a name and child module

        Example::

            >>> # xdoctest: +SKIP("undefined vars")
            >>> for name, module in model.named_children():
            >>>     if name in ['conv4', 'conv5']:
            >>>         print(module)

        """
        memo = set()
        for name, module in self._modules.items():
            if module is not None and module not in memo:
                memo.add(module)
                yield name, module
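
The difference from modules() is easiest to see side by side. The following sketch assumes a small nested nn.Sequential and also checks the memo set in the source above, which skips a child that is registered twice:

```python
import torch.nn as nn

# Assumed toy network: one plain child and one nested container.
net = nn.Sequential(
    nn.Linear(4, 4),
    nn.Sequential(nn.Linear(4, 4), nn.ReLU()),
)

child_names = [name for name, _ in net.named_children()]
module_names = [name for name, _ in net.named_modules()]
print(child_names)   # ['0', '1']
print(module_names)  # ['', '0', '1', '1.0', '1.1']

# The same instance registered under two names is yielded only once.
shared = nn.Linear(2, 2)
dup = nn.Sequential(shared, shared)
num_unique_children = len(list(dup.children()))
print(num_unique_children)  # 1
```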

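# parameters(): yields only learnable parameters (nn.Parameter); recurse=True by default

model.named_parameters()/model.parameters()

Recursively yields all of the model's parameters, including any you registered yourself.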
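
A small sketch of the recurse flag, assuming a hypothetical module Head that owns one parameter directly and one nn.Linear submodule:

```python
import torch
import torch.nn as nn


class Head(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 2)
        self.temperature = nn.Parameter(torch.ones(1))  # registered directly


head = Head()

# recurse=True (default): the module's own parameters, then its submodules'.
all_names = [name for name, _ in head.named_parameters()]
# recurse=False: only parameters registered directly on this module.
own_names = [name for name, _ in head.named_parameters(recurse=False)]

total = sum(p.numel() for p in head.parameters())
print(all_names)  # ['temperature', 'proj.weight', 'proj.bias']
print(own_names)  # ['temperature']
print(total)      # 1 + 4*2 + 2 = 11
```

Summing numel() over parameters() is a common way to count a model's parameters.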

# buffers(): yields only non-learnable tensors registered as buffers; recurse=True by default

model.named_buffers()/ model.buffers()

model._buffers#OrderedDict()

model.buffers()  # returns a generator, not a list

list(model.buffers())[0].size()#torch.Size([2304, 768])

type(list(model.named_buffers())[0])#tuple

list(model.named_buffers())[0][0]#'blocks.0.attn.head_mask'

dict(model.named_buffers()).keys()

dict(model.buffers())#ValueError: dictionary update sequence element #0 has length 2304; 2 is required

len(list(model.buffers()))#12

#model._parameters.keys()#odict_keys(['cls_token', 'pos_embed'])

    def buffers(self, recurse: bool = True) -> Iterator[Tensor]:
        for _, buf in self.named_buffers(recurse=recurse):
            yield buf

    def named_buffers(self, prefix: str = '', recurse: bool = True, remove_duplicate: bool = True) -> Iterator[Tuple[str, Tensor]]:
        r"""Returns an iterator over module buffers, yielding both the
        name of the buffer as well as the buffer itself

        """
        gen = self._named_members(
            lambda module: module._buffers.items(),
            prefix=prefix, recurse=recurse, remove_duplicate=remove_duplicate)
        yield from gen

            >>> # recurse=True by default
            >>> for name, buf in self.named_buffers():
            >>>     if name in ['running_var']:
            >>>         print(buf.size())
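
A runnable sketch of the same idea, assuming a hypothetical module Masked with one buffer of its own and a BatchNorm1d child. With recursion, nested buffer names carry the submodule prefix, so this sketch matches on the name's suffix instead of the bare name 'running_var':

```python
import torch
import torch.nn as nn


class Masked(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.bn = nn.BatchNorm1d(4)
        self.register_buffer("mask", torch.ones(4))  # non-learnable tensor


m = Masked()

buffer_names = [name for name, _ in m.named_buffers()]
own_names = [name for name, _ in m.named_buffers(recurse=False)]
param_names = [name for name, _ in m.named_parameters()]

# Nested names are prefixed, e.g. 'bn.running_var'.
running_var_shapes = {
    name: tuple(buf.size())
    for name, buf in m.named_buffers()
    if name.endswith("running_var")
}

print(buffer_names)
# ['mask', 'bn.running_mean', 'bn.running_var', 'bn.num_batches_tracked']
print(own_names)           # ['mask']
print(param_names)         # ['bn.weight', 'bn.bias']
print(running_var_shapes)  # {'bn.running_var': (4,)}
```

Buffers and parameters are disjoint: mask and the BatchNorm statistics never appear in named_parameters().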

# state_dict(): a dictionary that also includes buffers

model.state_dict()

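model.state_dict() returns an ordered dictionary.

Each key is the name of a layer followed by the name of a tensor it owns, such as .weight or .bias, and each value is the corresponding tensor. Layers without parameters or buffers do not appear.

It contains both parameters and persistent buffers.

Unlike the methods above, model.state_dict() needs no iteration: it is already a dictionary. Its tensors share storage with the model, so modifying them in place changes the corresponding layers directly, which is especially convenient for parameter pruning; assigning a new tensor to a key only takes effect after model.load_state_dict(). For more on state_dict, see PyTorch模型保存深入理解 (Ciao112's blog, CSDN).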
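
The sketch below illustrates both points on an assumed toy network: the keys cover parameters and buffers, and an in-place edit reaches the model while re-binding a key does not until the dictionary is loaded back. The mask is a made-up pruning pattern, not a real pruning method.

```python
import torch
import torch.nn as nn

# Assumed toy network; ReLU owns no tensors, so it contributes no keys.
net = nn.Sequential(nn.Linear(2, 2), nn.ReLU(), nn.BatchNorm1d(2))
sd = net.state_dict()
keys = list(sd.keys())
print(keys)
# ['0.weight', '0.bias', '2.weight', '2.bias',
#  '2.running_mean', '2.running_var', '2.num_batches_tracked']

# In-place edit: tensors in sd share storage with the model's tensors.
mask = torch.tensor([[1.0, 0.0], [0.0, 1.0]])  # hypothetical pruning mask
sd["0.weight"].mul_(mask)
pruned = net[0].weight.detach().clone()  # off-diagonal entries are now zero

# Re-binding a key changes only the dictionary until it is loaded back.
sd["0.bias"] = torch.ones(2)
changed_before_load = torch.equal(net[0].bias.detach(), torch.ones(2))
net.load_state_dict(sd)
changed_after_load = torch.equal(net[0].bias.detach(), torch.ones(2))
print(changed_before_load, changed_after_load)  # False True
```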
