A Complete Guide to PyTorch's Model Inspection Functions: the model.modules() Family

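A model usually subclasses nn.Module, and its instances hold several ordered dictionaries: _modules, _parameters, and _buffers. These store the model's direct submodules, and the parameters and buffers registered on the model itself.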

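Note that the keys of the _modules dictionary are submodule names and the values are the submodule instances, so you can step down into the submodules of submodules, as in the following two expressions: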

    model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._parameters.keys()  # odict_keys(['weight', 'bias'])
    model._modules["blocks"]._modules["0"]._modules["attn"]._modules["qkv"]._buffers.keys()  # odict_keys(['weight_mask'])

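Because these are dictionaries, the keys(), values(), and items() methods are available.

For example, model._modules.items() is an iterable view of (name, module) pairs for the model's direct submodules.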
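As a minimal sketch of the three dictionaries, the snippet below builds a small toy model and inspects them. `TinyNet`, `scale`, and `step` are hypothetical names chosen for illustration and do not come from the model used in this article.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(4, 8), nn.ReLU())
        self.head = nn.Linear(8, 2)
        # A parameter and a buffer registered directly on the model itself.
        self.scale = nn.Parameter(torch.ones(1))
        self.register_buffer("step", torch.zeros(1))


model = TinyNet()

# Only direct submodules and directly registered tensors appear here.
module_names = list(model._modules.keys())
param_names = list(model._parameters.keys())
buffer_names = list(model._buffers.keys())
print(module_names, param_names, buffer_names)

# Step one level further down through a child's own _modules dictionary.
first_linear = model._modules["body"]._modules["0"]
print(list(first_linear._parameters.keys()))
```

The weights of the two Linear layers are absent from `model._parameters`; they live in the `_parameters` dictionary of the layer that owns them.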

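Next, let's look at several methods of model.

Most of them return generators, so we fetch the values with a loop or next(), or convert the result with list() or dict().

For background on generators, iterators, and iterables, see: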

一文看懂python的迭代器和可迭代对象 - 知乎 (zhihu.com)

Python迭代器和生成器详解 - 知乎 (zhihu.com)

    model._buffers  # OrderedDict()
    model.buffers()  # <generator object Module.buffers at 0x7f7a80496d60>
    list(model.buffers())[0].size()  # torch.Size([2304, 768])
    type(list(model.named_buffers())[0])  # tuple
    list(model.named_buffers())[0][0]  # 'blocks.0.attn.head_mask'
    dict(model.named_buffers()).keys()
    dict(model.buffers())  # ValueError: dictionary update sequence element #0 has length 2304; 2 is required
    len(list(model.buffers()))  # 12
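The same access patterns can be tried on a small stand-in model. This is only a sketch: the two-layer `nn.Sequential` below is an assumption made for illustration, with BatchNorm chosen because it registers buffers (its running statistics).

```python
import types

import torch.nn as nn

# Hypothetical stand-in model; BatchNorm1d registers three buffers.
model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))

gen = model.buffers()
is_generator = isinstance(gen, types.GeneratorType)

first = next(gen)                    # pull one tensor at a time
all_buffers = list(model.buffers())  # or materialise everything at once
named = dict(model.named_buffers())  # (name, tensor) pairs fit dict()

print(is_generator, first.shape, len(all_buffers), list(named))
```

dict() works on named_buffers() because every element is a (name, tensor) pair. buffers() yields bare tensors, which is why dict(model.buffers()) raises the ValueError shown above.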

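# modules(): full recursive traversal

model.named_modules() / model.modules()

model.modules() recursively walks every layer of the model: the model itself, its sublayers, and the sublayers of those sublayers.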

    def named_modules(self, memo: Optional[Set['Module']] = None, prefix: str = '', remove_duplicate: bool = True):
        r"""Returns an iterator over all modules in the network, yielding
        both the name of the module as well as the module itself.

        Args:
            memo: a memo to store the set of modules already added to the result
            prefix: a prefix that will be added to the name of the module
            remove_duplicate: whether to remove the duplicated module instances in the result
                or not

        Yields:
            (str, Module): Tuple of name and module

        Note:
            Duplicate modules are returned only once. In the following
            example, ``l`` will be returned only once.

        Example::

            >>> l = nn.Linear(2, 2)
            >>> net = nn.Sequential(l, l)
            >>> for idx, m in enumerate(net.named_modules()):
            ...     print(idx, '->', m)

            0 -> ('', Sequential(
              (0): Linear(in_features=2, out_features=2, bias=True)
              (1): Linear(in_features=2, out_features=2, bias=True)
            ))
            1 -> ('0', Linear(in_features=2, out_features=2, bias=True))

        """

        if memo is None:
            memo = set()
        if self not in memo:
            if remove_duplicate:
                memo.add(self)
            yield prefix, self
            for name, module in self._modules.items():
                if module is None:
                    continue
                submodule_prefix = prefix + ('.' if prefix else '') + name
                for m in module.named_modules(memo, submodule_prefix, remove_duplicate):
                    yield m

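The former additionally returns each module's name, which makes it easier to pick out specific layers in order to access, initialize, or modify their parameters.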

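    import torch.nn as nn

    # Select layers by name.
    for name, layer in model.named_modules():
        if 'conv' in name:
            pass  # process layer here

    # When names are not returned, isinstance() achieves the same selection.
    for layer in model.modules():
        if isinstance(layer, nn.Conv2d):
            pass  # process layer here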
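The two loops above, together with the duplicate handling in the quoted source, can be sketched on a concrete model. The model below is hypothetical: it registers the same Linear instance twice so that the effect of `remove_duplicate` becomes visible.

```python
import torch.nn as nn

# Hypothetical model: the same Linear instance is registered twice.
shared = nn.Linear(2, 2)
model = nn.Sequential(
    nn.Conv2d(1, 3, kernel_size=3),
    nn.Sequential(shared, shared),
)

# '' is the name of the model itself; the second copy of `shared` is skipped.
names = [name for name, _ in model.named_modules()]
print(names)

# Select layers by type and re-initialise them.
for layer in model.modules():
    if isinstance(layer, nn.Conv2d):
        nn.init.zeros_(layer.bias)

# remove_duplicate=False also yields repeated instances.
all_names = [name for name, _ in model.named_modules(remove_duplicate=False)]
print(all_names)
```

Selecting by type with isinstance() tends to be more robust than matching on names, since names depend on how the author happened to label the attributes.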

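# children(): direct sublayers only

model.named_children() / model.children()

model.children() iterates only over the model's direct sublayers; it does not descend into the sublayers of those sublayers.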

    def named_children(self) -> Iterator[Tuple[str, 'Module']]:
        r"""Returns an iterator over immediate children modules, yielding both
        the name of the module as well as the module itself.

        Yields:
            (str, Module): Tuple containing a name and child module

        Example::

            >>> # xdoctest: +SKIP("undefined vars")
            >>> for name, module in model.named_children():
            >>>     if name in ['conv4', 'conv5']:
            >>>         print(module)

        """
        memo = set()
        for name, module in self._modules.items():
            if module is not None and module not in memo:
                memo.add(module)
                yield name, module
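To make the difference from modules() concrete, here is a small comparison on a hypothetical nested model (the layer sizes are arbitrary):

```python
import torch.nn as nn

# Hypothetical nested model: one inner Sequential plus one Linear.
model = nn.Sequential(
    nn.Sequential(nn.Linear(4, 8), nn.ReLU()),
    nn.Linear(8, 2),
)

# named_children() stops at the first level.
child_names = [name for name, _ in model.named_children()]

# named_modules() includes the model itself ('') and all nested layers.
module_names = [name for name, _ in model.named_modules()]

print(child_names)
print(module_names)
```

children() is the natural choice when you want to treat top-level blocks as units, for example to slice a backbone into stages.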

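# parameters(): registered parameters only; recurse=True by default

model.named_parameters() / model.parameters()

Recursively returns all of the model's parameters, including those registered directly on the model itself. Parameters are returned whether or not requires_grad is set; with recurse=False only the model's own directly registered parameters are returned.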
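A common use of parameters() is counting or freezing weights. The following is a sketch on a hypothetical three-layer model, not the model discussed in this article:

```python
import torch.nn as nn

# Hypothetical model with 4*8+8 = 40 and 8*2+2 = 18 parameters.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze the first layer; its tensors still show up in parameters().
for p in model[0].parameters():
    p.requires_grad_(False)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

# recurse=False returns only parameters registered on this module itself;
# a Sequential container has none of its own.
own = list(model.named_parameters(recurse=False))

print(total, trainable, own)
```

When building an optimizer for a partially frozen model, filtering on requires_grad as above is one way to pass only the trainable tensors.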

# buffers(): registered buffers only (tensors that are not optimized); recurse=True by default

model.named_buffers() / model.buffers()

    model._parameters.keys()  # odict_keys(['cls_token', 'pos_embed'])

    def buffers(self, recurse: bool = True) -> Iterator[Tensor]:
        for _, buf in self.named_buffers(recurse=recurse):
            yield buf

    def named_buffers(self, prefix: str = '', recurse: bool = True, remove_duplicate: bool = True) -> Iterator[Tuple[str, Tensor]]:
        r"""Returns an iterator over module buffers, yielding both the
        name of the buffer as well as the buffer itself

        """
        gen = self._named_members(
            lambda module: module._buffers.items(),
            prefix=prefix, recurse=recurse, remove_duplicate=remove_duplicate)
        yield from gen

>>> # recurse=True by default
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
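For a self-contained illustration of how buffers are created and listed, consider the sketch below. `Masked`, `mask`, and `scratch` are hypothetical names; the only API assumed is `register_buffer` with its `persistent` flag.

```python
import torch
import torch.nn as nn


class Masked(nn.Module):
    """Hypothetical layer with a fixed, non-trainable mask."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 3)
        self.register_buffer("mask", torch.ones(3))
        # persistent=False keeps this buffer out of state_dict().
        self.register_buffer("scratch", torch.zeros(3), persistent=False)

    def forward(self, x):
        return self.linear(x) * self.mask


layer = Masked()

buffer_names = [name for name, _ in layer.named_buffers()]
param_names = [name for name, _ in layer.named_parameters()]
saved_keys = sorted(layer.state_dict().keys())

print(buffer_names, param_names, saved_keys)
```

Buffers follow the module through .to() and .cuda() like parameters do, but they are not returned by parameters() and so are never handed to an optimizer built from it.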

# state_dict(): returns a dictionary that also includes buffers

model.state_dict()

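model.state_dict() returns an ordered dictionary that contains all of the model's parameters and its persistent buffers.

Each key is the name of the owning layer followed by the tensor's name (weight, bias, or a buffer name such as running_mean), and each value is the corresponding tensor. Layers that have neither parameters nor buffers do not appear in it.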

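Unlike the methods above, model.state_dict() needs no iteration: it returns a dictionary directly. The tensors in it share memory with the model, so modifying them in place changes the corresponding layer's parameters, which is especially convenient for parameter pruning. Assigning a new tensor to a key only changes the dictionary; call model.load_state_dict() to write such changes back. For a detailed look at state_dict, see the CSDN blog post "PyTorch模型保存深入理解" by Ciao112.

Source: https://www.toymoban.com/news/detail-738238.html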
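The following sketch shows the contents of a state dict and the two ways of editing it. The small model is a hypothetical stand-in; the behaviour shown relies on state_dict() returning detached tensors that share storage with the model, which is how the default call (keep_vars=False) behaves.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model with both parameters and buffers.
model = nn.Sequential(nn.Linear(2, 2), nn.BatchNorm1d(2))

state = model.state_dict()
keys = list(state.keys())
print(keys)  # parameters and BatchNorm's running statistics

# In-place edits reach the model, because storage is shared.
state["0.weight"].zero_()
zeroed = bool((model[0].weight == 0).all())

# Rebinding a key only changes the dictionary, not the model.
state["0.bias"] = torch.full((2,), 5.0)
unchanged = not bool((model[0].bias == 5).all())

# load_state_dict() copies the edited values back into the model.
model.load_state_dict(state)
loaded = bool((model[0].bias == 5).all())

print(zeroed, unchanged, loaded)
```

For pruning-style edits, the in-place route (for example multiplying a weight by a mask with `mul_`) avoids the extra load_state_dict() call.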
