PyTorch: a close look at nn.Module and the difference between the children() and modules() methods


In PyTorch, virtually every custom operation is built by subclassing nn.Module, so this article takes a closer look at this core class.

After inheriting from nn.Module, there are two ways to implement the layers of a custom model class:

(1) High-level API: implement layers with the classes under torch.nn.*. These interfaces are classes, and a class can store parameters; for example, a fully connected layer's weight matrix and bias are kept as attributes of the instance. Create an object from such a class, assign it as an attribute of your custom model class, and the parameter storage is handled for you.

(2) Low-level API: implement layers with the functions under torch.nn.functional.*. As the name suggests, these are plain function interfaces: they perform the computation but have no way to hold state. If you build a layer with learnable parameters this way, you must implement the parameter storage yourself (see the sketch below).
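To make the contrast concrete, here is a minimal sketch (the class name TinyLinear and the shapes are illustrative, not from the original): nn.Linear stores its weight and bias internally, while the stateless F.linear requires registering the parameters yourself via nn.Parameter.

import torch
import torch.nn.functional as F

# High-level API: nn.Linear owns its weight and bias
fc = torch.nn.Linear(4, 2)

# Low-level API: F.linear is stateless, so the module must store the parameters itself
class TinyLinear(torch.nn.Module):
    def __init__(self, in_features, out_features):
        super(TinyLinear, self).__init__()
        # nn.Parameter registers the tensors with the module so they are tracked
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        return F.linear(x, self.weight, self.bias)

x = torch.randn(1, 4)
print(fc(x).shape, TinyLinear(4, 2)(x).shape)  # both: torch.Size([1, 2])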

When defining a custom network model, you must subclass nn.Module and override two methods: the __init__ constructor and forward. A few tips and caveats:

(1) Layers with learnable parameters (fully connected layers, convolutional layers, and so on) generally go in the __init__ constructor; layers without parameters may of course be placed there as well;

(2) Layers without learnable parameters (e.g., ReLU, dropout) may be declared in the __init__ constructor or not. (Batch normalization is often listed among them, but note that it actually carries learnable affine parameters and running statistics, so it is safer to keep it in __init__.) If a parameter-free layer is not declared in __init__, call its nn.functional equivalent inside forward; since the layer was never registered as a model attribute, it will not appear when the model is printed or visualized;

(3) The forward method must be overridden: it implements the model's computation and is where the connections between the layers are defined.

Summary: to display and understand the structure of the model you define more clearly, it is recommended to declare all layers (with and without learnable parameters) in the constructor, making them attributes of the model, and to implement the connections between layers in forward.

Note: parameters defined this way support automatic differentiation out of the box, but if a custom operation is not differentiable, you also need to implement a backward function yourself.
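Custom backward passes are written with torch.autograd.Function. Below is a minimal sketch; the op (a clamped identity) and the class name are purely illustrative, not part of the original text.

import torch

class ClampedIdentity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0.0, max=1.0)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        # Pass gradients through only where the input was inside the clamp range
        grad = grad_output.clone()
        grad[(x < 0.0) | (x > 1.0)] = 0.0
        return grad

x = torch.randn(3, requires_grad=True)
ClampedIdentity.apply(x).sum().backward()
print(x.grad)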

import torch


class MyNet(torch.nn.Module):
    def __init__(self):
        # Must call the parent class constructor: we want to use the parent's methods, which is the whole point of inheriting from Module
        super(MyNet, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.relu1 = torch.nn.ReLU()
        self.max_pooling1 = torch.nn.MaxPool2d(2, 1)
        self.conv2 = torch.nn.Conv2d(32, 32, 3, 1, 1)  # takes the 32 channels produced by conv1
        self.relu2 = torch.nn.ReLU()
        self.max_pooling2 = torch.nn.MaxPool2d(2, 1)
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.max_pooling1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.max_pooling2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features) before the dense layers
        x = self.dense1(x)
        x = self.dense2(x)
        return x

model = MyNet()
print(model)
'''Output:
MyNet(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (max_pooling1): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU()
  (max_pooling2): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
Printing the model displays the attributes of the custom class, in the order they were defined.
The actual connections between the layers are not shown (a model visualization tool is needed for that); this is also why declaring all layers in the constructor is recommended.
'''

Here is a second example:

import torch
import torch.nn.functional as F


class MyNet(torch.nn.Module):
    def __init__(self):
        # Must call the parent class constructor: we want to use the parent's methods, which is the whole point of inheriting from Module
        super(MyNet, self).__init__() 
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.conv2 = torch.nn.Conv2d(32, 32, 3, 1, 1)  # takes the 32 channels produced by conv1
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)  # kernel_size is a required argument
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = x.view(x.size(0), -1)  # flatten before the dense layers
        x = self.dense1(x)
        x = self.dense2(x)
        return x

model = MyNet()
print(model)
'''Output:
MyNet(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
You can see how this differs from the version above: printing the model no longer reveals the activation and pooling layers or their hyperparameter settings, so it is not as self-documenting as the first approach.
'''
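One caveat with the functional style is worth flagging here (a standard PyTorch gotcha; the class name DropLayer is just for illustration): mode-dependent layers such as dropout do not see model.train()/model.eval() when called functionally, so the training flag must be forwarded by hand.

import torch
import torch.nn.functional as F

class DropLayer(torch.nn.Module):
    def forward(self, x):
        # F.dropout ignores the module's train/eval mode unless told about it;
        # nn.Dropout would forward self.training automatically
        return F.dropout(x, p=0.5, training=self.training)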

Here is a summary of four ways to build a model:

Method 1: no torch.nn.Sequential container. This style suits fairly simple models; it does not scale well to complex ones.

import torch
import torch.nn.functional as F
from collections import OrderedDict


class Net1(torch.nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 32, 3, 1, 1)
        self.dense1 = torch.nn.Linear(32 * 3 * 3, 128)
        self.dense2 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = x.view(x.size(0), -1)
        x = F.relu(self.dense1(x))
        x = self.dense2(x)
        return x

print("Method 1:")
model1 = Net1()
print(model1)
'''
Net1(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
You can see that each layer's displayed name defaults to its attribute (variable) name
'''

Method 2: use the torch.nn.Sequential container (roughly analogous to a Python list) for quick assembly; the layers are appended to the container in order. The drawback is that each layer is labeled with a default integer index, which makes layers hard to tell apart.

class Net2(torch.nn.Module):
    def __init__(self):
        super(Net2, self).__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, 3, 1, 1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(2))
        self.dense = torch.nn.Sequential(
            torch.nn.Linear(32 * 3 * 3, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 10)
        )

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out


print("Method 2:")
model2 = Net2()
print(model2)
print(model2.conv[2])  # get the attribute first, then index into it
'''
Net2(
  (conv): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (0): Linear(in_features=288, out_features=128, bias=True)
    (1): ReLU()
    (2): Linear(in_features=128, out_features=10, bias=True)
  )
)
The model has two attributes, conv and dense, both of type Sequential; print(model2.conv[2]) then yields:

MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
'''

Method 3: an improvement on Method 2: each layer is added with add_module(), which also assigns it an individual name. The layer names are no longer the default index numbers but the names you supply.

class Net3(torch.nn.Module):
    def __init__(self):
        super(Net3, self).__init__()
        self.conv = torch.nn.Sequential()
        self.conv.add_module("conv1", torch.nn.Conv2d(3, 32, 3, 1, 1))
        self.conv.add_module("relu1", torch.nn.ReLU())
        self.conv.add_module("pool1", torch.nn.MaxPool2d(2))
        self.dense = torch.nn.Sequential()
        self.dense.add_module("dense1", torch.nn.Linear(32 * 3 * 3, 128))
        self.dense.add_module("relu2", torch.nn.ReLU())
        self.dense.add_module("dense2", torch.nn.Linear(128, 10))

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out


print("Method 3:")
model3 = Net3()
print(model3)
# Get the attribute first, then index into it. Even though each layer now has a name, the container is still a Sequential (list-like), so only integer indexing works; model3.conv['pool1'] raises an error
print(model3.conv[2])  
'''
Net3(
  (conv): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
The model again has two attributes, conv and dense, both Sequential; the only difference from the previous method is that the index numbers are replaced by the assigned names. print(model3.conv[2]) yields:

MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
'''

Method 4: an alternative way to write Method 3: instead of add_module(), the layers are passed to Sequential as an OrderedDict, again giving each layer its own name.

class Net4(torch.nn.Module):
    def __init__(self):
        super(Net4, self).__init__()
        self.conv = torch.nn.Sequential(
            OrderedDict(
                [
                    ("conv1", torch.nn.Conv2d(3, 32, 3, 1, 1)),
                    ("relu1", torch.nn.ReLU()),
                    ("pool1", torch.nn.MaxPool2d(2))
                ]
            ))

        self.dense = torch.nn.Sequential(
            OrderedDict([
                ("dense1", torch.nn.Linear(32 * 3 * 3, 128)),
                ("relu2", torch.nn.ReLU()),
                ("dense2", torch.nn.Linear(128, 10))
            ])
        )

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out


print("Method 4:")
model4 = Net4()
print(model4)
# Get the attribute first, then index into it. Even though each layer has a name, the container is still a Sequential (list-like), so only integer indexing works; model4.conv['pool1'] raises an error
print(model4.conv[2])
'''
Net4(
  (conv): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
The model again has two attributes, conv and dense, both of type Sequential; print(model4.conv[2]) yields:

MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
'''
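One detail worth knowing (standard nn.Sequential behavior, not covered in the original): layers registered under a name remain reachable through attribute access, even though string indexing fails.

print(model4.conv[2])     # integer indexing works
print(model4.conv.pool1)  # attribute access via the assigned name also works
# model4.conv['pool1']    # TypeError: Sequential does not support string indexing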

Commonly used methods of the Module class

1. The named_children() and children() methods

These retrieve only the model's direct 'children' and do not descend to the 'grandchildren'. A typical use is grabbing a linear run of a model's submodules to quickly assemble a new model, e.g. nn.Sequential(*list(model.children())[:-2]) (note the * unpacking: Sequential expects modules as separate arguments, not a single list).
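As a concrete sketch of that truncation pattern (torchvision's resnet18 is used purely for illustration; any model with a linear trunk works the same way):

import torch
import torchvision

backbone = torchvision.models.resnet18()
# Drop the last two children (avgpool and fc), keeping only the convolutional trunk
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])
print(feature_extractor(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 512, 7, 7])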

class Net4(torch.nn.Module):
    def __init__(self):
        super(Net4, self).__init__()
        self.conv = torch.nn.Sequential(
            OrderedDict(
                [
                    ("conv1", torch.nn.Conv2d(3, 32, 3, 1, 1)),
                    ("relu1", torch.nn.ReLU()),
                    ("pool1", torch.nn.MaxPool2d(2))
                ]
            ))

        self.dense = torch.nn.Sequential(
            OrderedDict([
                ("dense1", torch.nn.Linear(32 * 3 * 3, 128)),
                ("relu2", torch.nn.ReLU()),
                ("dense2", torch.nn.Linear(128, 10))
            ])
        )

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out


print("Method 4:")
model4 = Net4()
# children() essentially yields the model's direct submodule attributes; the constructor defined two of them
for i in model4.children():
    print(i)
    print(type(i))
print('==============================')
# named_children() likewise yields the submodule attributes, together with each one's name
for i in model4.named_children():
    print(i)
    print(type(i))
'''
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)

Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)


==============================
('conv', Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))

('dense', Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
))

'''

2. The named_modules() and modules() methods

Both modules() and named_modules() traverse every component of the model (container layers, individual layers, custom layers, and so on) from shallow to deep, all the way down to the innermost single layers. The difference is that modules() yields the layer objects themselves, while named_modules() yields tuples whose first element is the name and whose second element is the layer object. They are mainly useful when you need a handle on every layer, for example for weight initialization or when loading parameters.
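As a sketch of the weight-initialization use case (the init choices here are illustrative defaults, not prescribed by the original), a single loop over modules() reaches every layer regardless of how deeply it is nested:

import torch

def init_weights(model):
    # modules() visits every layer, including those nested inside Sequential containers
    for m in model.modules():
        if isinstance(m, torch.nn.Conv2d):
            torch.nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
            if m.bias is not None:
                torch.nn.init.zeros_(m.bias)
        elif isinstance(m, torch.nn.Linear):
            torch.nn.init.normal_(m.weight, std=0.01)
            torch.nn.init.zeros_(m.bias)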

class Net4(torch.nn.Module):
    def __init__(self):
        super(Net4, self).__init__()
        self.conv = torch.nn.Sequential(
            OrderedDict(
                [
                    ("conv1", torch.nn.Conv2d(3, 32, 3, 1, 1)),
                    ("relu1", torch.nn.ReLU()),
                    ("pool1", torch.nn.MaxPool2d(2))
                ]
            ))

        self.dense = torch.nn.Sequential(
            OrderedDict([
                ("dense1", torch.nn.Linear(32 * 3 * 3, 128)),
                ("relu2", torch.nn.ReLU()),
                ("dense2", torch.nn.Linear(128, 10))
            ])
        )

    def forward(self, x):
        conv_out = self.conv(x)
        res = conv_out.view(conv_out.size(0), -1)
        out = self.dense(res)
        return out


print("Method 4:")
model4 = Net4()
# modules() walks every component of the model (wrapper Sequentials, individual layers, custom layers, ...) from shallow to deep, down to the innermost single layers
for i in model4.modules():
    print(i)
    print('==============================')
print('============= dividing line =================')
# named_modules() does the same, but each yielded element is a tuple: (name, layer object)
for i in model4.named_modules():
    print(i)
    print('==============================')
'''
离模型最近即最浅处,就是模型本身
Net4(
  (conv): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
)
==============================
Going deeper: the model's attributes
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
==============================
Deeper still: inside a model attribute
Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
==============================
Continuing deeper, this attribute is traversed element by element until exhausted
ReLU()
==============================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Once this attribute is fully traversed, the walk moves on to the other attribute
==============================
Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
)
==============================
Linear(in_features=288, out_features=128, bias=True)
==============================
ReLU()
==============================
Linear(in_features=128, out_features=10, bias=True)
==============================
============= dividing line =================
('', Net4(
  (conv): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU()
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (dense): Sequential(
    (dense1): Linear(in_features=288, out_features=128, bias=True)
    (relu2): ReLU()
    (dense2): Linear(in_features=128, out_features=10, bias=True)
  )
))
==============================
('conv', Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
==============================
('conv.conv1', Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
==============================
('conv.relu1', ReLU())
==============================
('conv.pool1', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False))
==============================
('dense', Sequential(
  (dense1): Linear(in_features=288, out_features=128, bias=True)
  (relu2): ReLU()
  (dense2): Linear(in_features=128, out_features=10, bias=True)
))
==============================
('dense.dense1', Linear(in_features=288, out_features=128, bias=True))
==============================
('dense.relu2', ReLU())
==============================
('dense.dense2', Linear(in_features=128, out_features=10, bias=True))
==============================
'''

So keep this difference between children() and modules() in mind.
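A quick numeric contrast using the model4 instance from above makes the difference tangible:

print(len(list(model4.children())))  # 2: the conv and dense Sequentials
print(len(list(model4.modules())))   # 9: Net4 itself + 2 Sequentials + 6 layers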
