Is the Dropout Layer Effective?

When I've done classification in the past, Dropout layers were generally added after the fully connected layers to prevent overfitting and improve the model's generalization. Dropout after convolutional layers is much less common (mainly because convolutional layers have few parameters and are not prone to overfitting). Today I looked up a few blog posts on the topic and am recording them here.
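For context, here is a minimal Keras sketch of that common pattern. The architecture, layer sizes, and the 0.5 rate are illustrative assumptions, not taken from any of the sources below.

```python
from tensorflow.keras import layers, models

# Hypothetical classifier: Dropout is used only after the fully connected
# layer, which is the common pattern described above.
model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),   # randomly drop 50% of the dense activations during training
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```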

First, an English-language blog post (the rest of that series is also well written): Dropout Regularization For Neural Networks
There is also a Chinese translation: 基于Keras/Python的深度学习模型Dropout正则项 (Dropout regularization for deep learning models with Keras/Python)

You can imagine that if neurons are randomly dropped out of the network during training, that other neurons will have to step in and handle the representation required to make predictions for the missing neurons. This is believed to result in multiple independent internal representations being learned by the network.

The effect is that the network becomes less sensitive to the specific weights of neurons. This in turn results in a network that is capable of better generalization and is less likely to overfit the training data.

An example of using Dropout on the CIFAR dataset: 92.45% on CIFAR-10 in Torch
In that model, Dropout is added after both the convolutional and the fully connected layers, but the dropout rates for the convolutional layers are usually < 0.5, e.g. 0.1, 0.2, or 0.3.
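As a rough illustration of that layout, here is a small Keras sketch, not the actual Torch model from the linked post; the filter counts and rates are made up, with the smaller rates placed on the convolutional blocks:

```python
from tensorflow.keras import layers, models

# Sketch of dropout in both convolutional and fully connected blocks.
# The rates below (0.1-0.3 for conv, 0.5 for dense) are illustrative,
# not the exact values from the linked Torch model.
model = models.Sequential([
    layers.Conv2D(64, 3, padding="same", activation="relu", input_shape=(32, 32, 3)),
    layers.Dropout(0.1),                      # small rate after an early conv layer
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.Dropout(0.2),                      # slightly larger rate deeper in the conv stack
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),                      # larger rate for the fully connected layer
    layers.Dense(10, activation="softmax"),
])
```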

Finally, the relevant observation from the Srivastava/Hinton paper that proposed Dropout:

“The additional gain in performance obtained by adding dropout in the convolutional layers (3.02% to 2.55%) is worth noting. One may have presumed that since the convolutional layers don’t have a lot of parameters, overfitting is not a problem and therefore dropout would not have much effect. However, dropout in the lower layers still helps because it provides noisy inputs for the higher fully connected layers which prevents them from overfitting.”
They use a keep probability of around 0.7 for the convolutional layers and 0.5 for the fully connected layers (note that in the paper, p refers to the probability of retaining a unit, not of dropping it).

In this experiment I added a Dropout layer right after the input layer; it feels like a form of data augmentation, but I don't know yet how well it will work.
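A minimal sketch of what I mean, assuming a Keras Sequential model on 32x32 RGB inputs (the 0.2 rate and the rest of the architecture are placeholders): applying Dropout directly to the input zeroes out a random subset of pixels on every training batch, which is why it feels like a noise-style augmentation.

```python
from tensorflow.keras import layers, models

# Dropout applied directly to the input: each training batch sees a
# different random subset of input pixels zeroed out, similar in spirit
# to injecting noise as a form of data augmentation.
model = models.Sequential([
    layers.Dropout(0.2, input_shape=(32, 32, 3)),   # rate 0.2 is just an example
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```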
