[Tech Share] Dynamic Graphs in Neural Networks

Paddle 2 supports dynamic graphs. In a dynamic graph, the structure of the neural network can change at runtime: the graph is built as the code runs ("build with run"), much as a Python variable can be assigned without declaring its type first. The benefit is a more flexible network whose data flow can be adjusted according to what actually happens at run time; the drawback is slower execution, the same trade-off Python itself makes. See Reference 1 and Reference 2.
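
As a minimal sketch of this "build with run" behavior (a standalone illustration, separate from the full example below): in dynamic mode every operation executes eagerly, so intermediate results can be inspected immediately, just like ordinary Python code.

import paddle

# Dynamic (imperative) mode is the default in Paddle 2: each operation runs
# as soon as it is called, with no separate graph-compilation step.
a = paddle.to_tensor([1.0, 2.0, 3.0])
b = a * 2 + 1          # executed immediately
print(b.numpy())       # [3. 5. 7.] -- the result is available right away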

A fuller example, in which the network structure itself varies from run to run:

import paddle
import paddle.nn.functional as F
import numpy as np


class MyModel(paddle.nn.Layer):
    def __init__(self, input_size, hidden_size):
        super(MyModel, self).__init__()
        self.linear1 = paddle.nn.Linear(input_size, hidden_size)
        self.linear2 = paddle.nn.Linear(hidden_size, hidden_size)
        self.linear3 = paddle.nn.Linear(hidden_size, 1)

    def forward(self, inputs):
        x = self.linear1(inputs)
        x = F.relu(x)

        # with ~50% probability, pass the activations through an extra hidden layer
        if paddle.rand([1,]) > 0.5:
            x = self.linear2(x)
            x = F.relu(x)

        x = self.linear3(x)
        
        return x     



total_data, batch_size, input_size, hidden_size = 1000, 64, 128, 256

x_data = np.random.randn(total_data, input_size).astype(np.float32)
y_data = np.random.randn(total_data, 1).astype(np.float32)

model = MyModel(input_size, hidden_size)
# paddle.summary itself runs a forward pass through the model to trace shapes
paddle.summary(model, (input_size,))

loss_fn = paddle.nn.MSELoss(reduction='mean')
optimizer = paddle.optimizer.SGD(learning_rate=0.01, 
                                 parameters=model.parameters())

for t in range(200 * (total_data // batch_size)):
    # draw a random mini-batch without replacement
    idx = np.random.choice(total_data, batch_size, replace=False)
    x = paddle.to_tensor(x_data[idx,:])
    y = paddle.to_tensor(y_data[idx,:])
    y_pred = model(x)

    loss = loss_fn(y_pred, y)
    if t % 200 == 0:
        print(t, loss.numpy())

    loss.backward()
    optimizer.step()
    optimizer.clear_grad()

In the code above, the second hidden layer has a 50% chance of appearing in any given forward pass. The network structure might look like this:

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-19          [[128]]               [256]             33,024     
   Linear-20          [[256]]               [256]             65,792     
   Linear-21          [[256]]                [1]                257      
===========================================================================
Total params: 99,073
Trainable params: 99,073
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.38
Estimated Total Size (MB): 0.38
---------------------------------------------------------------------------

Running it again, it might instead look like this:

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-25          [[128]]               [256]             33,024     
   Linear-27          [[256]]                [1]                257      
===========================================================================
Total params: 33,281
Trainable params: 33,281
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.13
Estimated Total Size (MB): 0.13
---------------------------------------------------------------------------
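
Where does the 50% come from? paddle.rand draws uniformly from [0, 1), so the comparison with 0.5 succeeds about half the time. A quick throwaway check (not part of the training script above):

import paddle

# estimate how often the random branch in forward() fires
n = 10000
hits = sum(int((paddle.rand([1]) > 0.5).numpy()[0]) for _ in range(n))
print(hits / n)   # roughly 0.5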

So when is the network structure actually fixed? When the model is defined, while it is being trained, or at inference time?

Append paddle.summary(model, (input_size,)) after the training loop to inspect the network once training has finished:

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-34          [[128]]               [256]             33,024     
   Linear-35          [[256]]               [256]             65,792     
   Linear-36          [[256]]                [1]                257      
===========================================================================
Total params: 99,073
Trainable params: 99,073
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.38
Estimated Total Size (MB): 0.38
---------------------------------------------------------------------------

Now run the model for inference:

x = np.random.randn(input_size).astype(np.float32)
x = paddle.to_tensor(x)
y_infer = model(x)
print(y_infer)

paddle.summary(model, (input_size,))

Again, there are two possible results:

Tensor(shape=[1], dtype=float32, place=CUDAPlace(0), stop_gradient=False,
       [0.94865662])
---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-34          [[128]]               [256]             33,024     
   Linear-36          [[256]]                [1]                257      
===========================================================================
Total params: 33,281
Trainable params: 33,281
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.13
Estimated Total Size (MB): 0.13
---------------------------------------------------------------------------

{'total_params': 33281, 'trainable_params': 33281}

Or:

Tensor(shape=[1], dtype=float32, place=CUDAPlace(0), stop_gradient=False,
       [-0.08627059])
---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Linear-34          [[128]]               [256]             33,024     
   Linear-35          [[256]]               [256]             65,792     
   Linear-36          [[256]]                [1]                257      
===========================================================================
Total params: 99,073
Trainable params: 99,073
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.38
Estimated Total Size (MB): 0.38
---------------------------------------------------------------------------
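
Another throwaway check, reusing the model and input_size defined above (an illustration, not part of the original script): call the model twice on the same input; the two outputs can differ because the random branch is re-drawn on every forward pass.

x = paddle.to_tensor(np.random.randn(input_size).astype(np.float32))
# two forward passes on identical input; the branch may differ between calls
print(model(x).numpy())
print(model(x).numpy())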

This shows that the model's structure is dynamic when the model is defined, dynamic during training, and still dynamic at inference: every forward pass, including the one paddle.summary performs internally to trace shapes, re-draws the random branch. The code above can be run in a notebook on Baidu AI Studio.
