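While using a DNN for image classification, I started wondering how the number of layers, the number of units in each layer, and the total parameter count of the network each affect how well it learns.
So I ran an experiment on how the trade-off between network depth and per-layer width affects learning when the total number of parameters is held fixed (roughly 10 million here). The layer counts and per-layer unit counts of each model, with the corresponding results, are shown below:
Random model 1, with 0 hidden layers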
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 12578) 9873730
_________________________________________________________________
dense_2 (Dense) (None, 10) 125790
=================================================================
Total params: 9,999,520
Trainable params: 9,999,520
Non-trainable params: 0
_________________________________________________________________
Elapsed time: 538.0814208984375s
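The Param # column follows from how a fully connected layer is parameterized: a Dense layer with n_in inputs and n_out units has n_in × n_out weights plus n_out biases. As a quick check, the sketch below recomputes the counts for the model above. The helper name dense_param_count is made up for this illustration, and the 784 input features are assumed from the 28×28 MNIST images used in the experiment.

```python
def dense_param_count(input_units, layer_units):
    """Per-layer parameter counts for a stack of fully connected layers.

    Each Dense layer has (inputs + 1) * units parameters: one weight per
    input/unit pair plus one bias per unit.
    """
    counts = []
    prev = input_units
    for units in layer_units:
        counts.append((prev + 1) * units)
        prev = units
    return counts


# Model 1 from the log above: 784 inputs -> 12578 units -> 10 outputs.
counts = dense_param_count(784, [12578, 10])
print(counts, sum(counts))  # [9873730, 125790] 9999520
```

The same formula reproduces the other summaries, which is why a very wide layer next to a very narrow one can still cost only a modest share of the parameter budget.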
Random model 2, with 1 hidden layer
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 3004) 2358140
_________________________________________________________________
dense_2 (Dense) (None, 2534) 7614670
_________________________________________________________________
dense_3 (Dense) (None, 10) 25350
=================================================================
Total params: 9,998,160
Trainable params: 9,998,160
Non-trainable params: 0
_________________________________________________________________
Elapsed time: 523.6505570411682s
Random model 3, with 2 hidden layers
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 3129) 2456265
_________________________________________________________________
dense_2 (Dense) (None, 1061) 3320930
_________________________________________________________________
dense_3 (Dense) (None, 3939) 4183218
_________________________________________________________________
dense_4 (Dense) (None, 10) 39400
=================================================================
Total params: 9,999,813
Trainable params: 9,999,813
Non-trainable params: 0
_________________________________________________________________
用时:573.7039511203766s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第4个随机模型,拥有3个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 1509) 1184565
_________________________________________________________________
dense_2 (Dense) (None, 425) 641750
_________________________________________________________________
dense_3 (Dense) (None, 6219) 2649294
_________________________________________________________________
dense_4 (Dense) (None, 886) 5510920
_________________________________________________________________
dense_5 (Dense) (None, 10) 8870
=================================================================
Total params: 9,995,399
Trainable params: 9,995,399
Non-trainable params: 0
_________________________________________________________________
用时:586.3007328510284s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第5个随机模型,拥有4个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 564) 442740
_________________________________________________________________
dense_2 (Dense) (None, 2446) 1381990
_________________________________________________________________
dense_3 (Dense) (None, 158) 386626
_________________________________________________________________
dense_4 (Dense) (None, 10117) 1608603
_________________________________________________________________
dense_5 (Dense) (None, 610) 6171980
_________________________________________________________________
dense_6 (Dense) (None, 10) 6110
=================================================================
Total params: 9,998,049
Trainable params: 9,998,049
Non-trainable params: 0
_________________________________________________________________
用时:636.9589238166809s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第6个随机模型,拥有5个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 1283) 1007155
_________________________________________________________________
dense_2 (Dense) (None, 1246) 1599864
_________________________________________________________________
dense_3 (Dense) (None, 1410) 1758270
_________________________________________________________________
dense_4 (Dense) (None, 579) 816969
_________________________________________________________________
dense_5 (Dense) (None, 224) 129920
_________________________________________________________________
dense_6 (Dense) (None, 19948) 4488300
_________________________________________________________________
dense_7 (Dense) (None, 10) 199490
=================================================================
Total params: 9,999,968
Trainable params: 9,999,968
Non-trainable params: 0
_________________________________________________________________
用时:679.5819964408875s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第7个随机模型,拥有6个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 689) 540865
_________________________________________________________________
dense_2 (Dense) (None, 1115) 769350
_________________________________________________________________
dense_3 (Dense) (None, 1440) 1607040
_________________________________________________________________
dense_4 (Dense) (None, 566) 815606
_________________________________________________________________
dense_5 (Dense) (None, 510) 289170
_________________________________________________________________
dense_6 (Dense) (None, 1757) 897827
_________________________________________________________________
dense_7 (Dense) (None, 2873) 5050734
_________________________________________________________________
dense_8 (Dense) (None, 10) 28740
=================================================================
Total params: 9,999,332
Trainable params: 9,999,332
Non-trainable params: 0
_________________________________________________________________
用时:634.280168056488s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第8个随机模型,拥有7个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 1578) 1238730
_________________________________________________________________
dense_2 (Dense) (None, 577) 911083
_________________________________________________________________
dense_3 (Dense) (None, 2133) 1232874
_________________________________________________________________
dense_4 (Dense) (None, 311) 663674
_________________________________________________________________
dense_5 (Dense) (None, 2246) 700752
_________________________________________________________________
dense_6 (Dense) (None, 195) 438165
_________________________________________________________________
dense_7 (Dense) (None, 8642) 1693832
_________________________________________________________________
dense_8 (Dense) (None, 360) 3111480
_________________________________________________________________
dense_9 (Dense) (None, 10) 3610
=================================================================
Total params: 9,994,200
Trainable params: 9,994,200
Non-trainable params: 0
_________________________________________________________________
用时:699.1033000946045s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第9个随机模型,拥有8个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 898) 704930
_________________________________________________________________
dense_2 (Dense) (None, 1084) 974516
_________________________________________________________________
dense_3 (Dense) (None, 974) 1056790
_________________________________________________________________
dense_4 (Dense) (None, 228) 222300
_________________________________________________________________
dense_5 (Dense) (None, 2805) 642345
_________________________________________________________________
dense_6 (Dense) (None, 58) 162748
_________________________________________________________________
dense_7 (Dense) (None, 13058) 770422
_________________________________________________________________
dense_8 (Dense) (None, 108) 1410372
_________________________________________________________________
dense_9 (Dense) (None, 34080) 3714720
_________________________________________________________________
dense_10 (Dense) (None, 10) 340810
=================================================================
Total params: 9,999,953
Trainable params: 9,999,953
Non-trainable params: 0
_________________________________________________________________
用时:805.9431340694427s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第10个随机模型,拥有9个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 640) 502400
_________________________________________________________________
dense_2 (Dense) (None, 510) 326910
_________________________________________________________________
dense_3 (Dense) (None, 16) 8176
_________________________________________________________________
dense_4 (Dense) (None, 59593) 1013081
_________________________________________________________________
dense_5 (Dense) (None, 7) 417158
_________________________________________________________________
dense_6 (Dense) (None, 162643) 1301144
_________________________________________________________________
dense_7 (Dense) (None, 8) 1301152
_________________________________________________________________
dense_8 (Dense) (None, 57961) 521649
_________________________________________________________________
dense_9 (Dense) (None, 19) 1101278
_________________________________________________________________
dense_10 (Dense) (None, 116901) 2338020
_________________________________________________________________
dense_11 (Dense) (None, 10) 1169020
=================================================================
Total params: 9,999,988
Trainable params: 9,999,988
Non-trainable params: 0
_________________________________________________________________
用时:2082.286678314209s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第11个随机模型,拥有10个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 146) 114610
_________________________________________________________________
dense_2 (Dense) (None, 427) 62769
_________________________________________________________________
dense_3 (Dense) (None, 519) 222132
_________________________________________________________________
dense_4 (Dense) (None, 1581) 822120
_________________________________________________________________
dense_5 (Dense) (None, 785) 1241870
_________________________________________________________________
dense_6 (Dense) (None, 1392) 1094112
_________________________________________________________________
dense_7 (Dense) (None, 25) 34825
_________________________________________________________________
dense_8 (Dense) (None, 50879) 1322854
_________________________________________________________________
dense_9 (Dense) (None, 28) 1424640
_________________________________________________________________
dense_10 (Dense) (None, 16031) 464899
_________________________________________________________________
dense_11 (Dense) (None, 199) 3190368
_________________________________________________________________
dense_12 (Dense) (None, 10) 2000
=================================================================
Total params: 9,997,199
Trainable params: 9,997,199
Non-trainable params: 0
_________________________________________________________________
用时:903.0978591442108s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第12个随机模型,拥有11个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 397) 311645
_________________________________________________________________
dense_2 (Dense) (None, 595) 236810
_________________________________________________________________
dense_3 (Dense) (None, 994) 592424
_________________________________________________________________
dense_4 (Dense) (None, 287) 285565
_________________________________________________________________
dense_5 (Dense) (None, 1273) 366624
_________________________________________________________________
dense_6 (Dense) (None, 392) 499408
_________________________________________________________________
dense_7 (Dense) (None, 1359) 534087
_________________________________________________________________
dense_8 (Dense) (None, 850) 1156000
_________________________________________________________________
dense_9 (Dense) (None, 295) 251045
_________________________________________________________________
dense_10 (Dense) (None, 3382) 1001072
_________________________________________________________________
dense_11 (Dense) (None, 565) 1911395
_________________________________________________________________
dense_12 (Dense) (None, 4954) 2803964
_________________________________________________________________
dense_13 (Dense) (None, 10) 49550
=================================================================
Total params: 9,999,589
Trainable params: 9,999,589
Non-trainable params: 0
_________________________________________________________________
用时:736.6282951831818s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第13个随机模型,拥有12个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 506) 397210
_________________________________________________________________
dense_2 (Dense) (None, 519) 263133
_________________________________________________________________
dense_3 (Dense) (None, 787) 409240
_________________________________________________________________
dense_4 (Dense) (None, 213) 167844
_________________________________________________________________
dense_5 (Dense) (None, 227) 48578
_________________________________________________________________
dense_6 (Dense) (None, 4284) 976752
_________________________________________________________________
dense_7 (Dense) (None, 163) 698455
_________________________________________________________________
dense_8 (Dense) (None, 196) 32144
_________________________________________________________________
dense_9 (Dense) (None, 3710) 730870
_________________________________________________________________
dense_10 (Dense) (None, 188) 697668
_________________________________________________________________
dense_11 (Dense) (None, 7205) 1361745
_________________________________________________________________
dense_12 (Dense) (None, 278) 2003268
_________________________________________________________________
dense_13 (Dense) (None, 7657) 2136303
_________________________________________________________________
dense_14 (Dense) (None, 10) 76580
=================================================================
Total params: 9,999,790
Trainable params: 9,999,790
Non-trainable params: 0
_________________________________________________________________
用时:805.9904351234436s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第14个随机模型,拥有13个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 583) 457655
_________________________________________________________________
dense_2 (Dense) (None, 1153) 673352
_________________________________________________________________
dense_3 (Dense) (None, 88) 101552
_________________________________________________________________
dense_4 (Dense) (None, 6637) 590693
_________________________________________________________________
dense_5 (Dense) (None, 69) 458022
_________________________________________________________________
dense_6 (Dense) (None, 10935) 765450
_________________________________________________________________
dense_7 (Dense) (None, 26) 284336
_________________________________________________________________
dense_8 (Dense) (None, 2425) 65475
_________________________________________________________________
dense_9 (Dense) (None, 143) 346918
_________________________________________________________________
dense_10 (Dense) (None, 2978) 428832
_________________________________________________________________
dense_11 (Dense) (None, 103) 306837
_________________________________________________________________
dense_12 (Dense) (None, 4238) 440752
_________________________________________________________________
dense_13 (Dense) (None, 129) 546831
_________________________________________________________________
dense_14 (Dense) (None, 32380) 4209400
_________________________________________________________________
dense_15 (Dense) (None, 10) 323810
=================================================================
Total params: 9,999,915
Trainable params: 9,999,915
Non-trainable params: 0
_________________________________________________________________
用时:921.6303458213806s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第15个随机模型,拥有14个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 41) 32185
_________________________________________________________________
dense_2 (Dense) (None, 1822) 76524
_________________________________________________________________
dense_3 (Dense) (None, 23) 41929
_________________________________________________________________
dense_4 (Dense) (None, 11587) 278088
_________________________________________________________________
dense_5 (Dense) (None, 18) 208584
_________________________________________________________________
dense_6 (Dense) (None, 34032) 646608
_________________________________________________________________
dense_7 (Dense) (None, 20) 680660
_________________________________________________________________
dense_8 (Dense) (None, 18654) 391734
_________________________________________________________________
dense_9 (Dense) (None, 28) 522340
_________________________________________________________________
dense_10 (Dense) (None, 5530) 160370
_________________________________________________________________
dense_11 (Dense) (None, 39) 215709
_________________________________________________________________
dense_12 (Dense) (None, 40388) 1615520
_________________________________________________________________
dense_13 (Dense) (None, 3) 121167
_________________________________________________________________
dense_14 (Dense) (None, 45287) 181148
_________________________________________________________________
dense_15 (Dense) (None, 106) 4800528
_________________________________________________________________
dense_16 (Dense) (None, 10) 1070
=================================================================
Total params: 9,974,164
Trainable params: 9,974,164
Non-trainable params: 0
_________________________________________________________________
用时:1267.0370609760284s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第16个随机模型,拥有15个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 598) 469430
_________________________________________________________________
dense_2 (Dense) (None, 741) 443859
_________________________________________________________________
dense_3 (Dense) (None, 773) 573566
_________________________________________________________________
dense_4 (Dense) (None, 193) 149382
_________________________________________________________________
dense_5 (Dense) (None, 123) 23862
_________________________________________________________________
dense_6 (Dense) (None, 5741) 711884
_________________________________________________________________
dense_7 (Dense) (None, 86) 493812
_________________________________________________________________
dense_8 (Dense) (None, 2113) 183831
_________________________________________________________________
dense_9 (Dense) (None, 311) 657454
_________________________________________________________________
dense_10 (Dense) (None, 808) 252096
_________________________________________________________________
dense_11 (Dense) (None, 748) 605132
_________________________________________________________________
dense_12 (Dense) (None, 888) 665112
_________________________________________________________________
dense_13 (Dense) (None, 856) 760984
_________________________________________________________________
dense_14 (Dense) (None, 657) 563049
_________________________________________________________________
dense_15 (Dense) (None, 2593) 1706194
_________________________________________________________________
dense_16 (Dense) (None, 668) 1732792
_________________________________________________________________
dense_17 (Dense) (None, 10) 6690
=================================================================
Total params: 9,999,129
Trainable params: 9,999,129
Non-trainable params: 0
_________________________________________________________________
用时:813.4983413219452s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第17个随机模型,拥有16个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 130) 102050
_________________________________________________________________
dense_2 (Dense) (None, 1383) 181173
_________________________________________________________________
dense_3 (Dense) (None, 7) 9688
_________________________________________________________________
dense_4 (Dense) (None, 60444) 483552
_________________________________________________________________
dense_5 (Dense) (None, 9) 544005
_________________________________________________________________
dense_6 (Dense) (None, 4362) 43620
_________________________________________________________________
dense_7 (Dense) (None, 158) 689354
_________________________________________________________________
dense_8 (Dense) (None, 777) 123543
_________________________________________________________________
dense_9 (Dense) (None, 1071) 833238
_________________________________________________________________
dense_10 (Dense) (None, 808) 866176
_________________________________________________________________
dense_11 (Dense) (None, 748) 605132
_________________________________________________________________
dense_12 (Dense) (None, 291) 217959
_________________________________________________________________
dense_13 (Dense) (None, 2127) 621084
_________________________________________________________________
dense_14 (Dense) (None, 328) 697984
_________________________________________________________________
dense_15 (Dense) (None, 2178) 716562
_________________________________________________________________
dense_16 (Dense) (None, 747) 1627713
_________________________________________________________________
dense_17 (Dense) (None, 2159) 1614932
_________________________________________________________________
dense_18 (Dense) (None, 10) 21600
=================================================================
Total params: 9,999,365
Trainable params: 9,999,365
Non-trainable params: 0
_________________________________________________________________
用时:1054.3112881183624s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第18个随机模型,拥有17个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 230) 180550
_________________________________________________________________
dense_2 (Dense) (None, 1640) 378840
_________________________________________________________________
dense_3 (Dense) (None, 182) 298662
_________________________________________________________________
dense_4 (Dense) (None, 204) 37332
_________________________________________________________________
dense_5 (Dense) (None, 1118) 229190
_________________________________________________________________
dense_6 (Dense) (None, 331) 370389
_________________________________________________________________
dense_7 (Dense) (None, 1686) 559752
_________________________________________________________________
dense_8 (Dense) (None, 314) 529718
_________________________________________________________________
dense_9 (Dense) (None, 164) 51660
_________________________________________________________________
dense_10 (Dense) (None, 1611) 265815
_________________________________________________________________
dense_11 (Dense) (None, 420) 677040
_________________________________________________________________
dense_12 (Dense) (None, 9) 3789
_________________________________________________________________
dense_13 (Dense) (None, 86197) 861970
_________________________________________________________________
dense_14 (Dense) (None, 8) 689584
_________________________________________________________________
dense_15 (Dense) (None, 134978) 1214802
_________________________________________________________________
dense_16 (Dense) (None, 8) 1079832
_________________________________________________________________
dense_17 (Dense) (None, 4025) 36225
_________________________________________________________________
dense_18 (Dense) (None, 628) 2528328
_________________________________________________________________
dense_19 (Dense) (None, 10) 6290
=================================================================
Total params: 9,999,768
Trainable params: 9,999,768
Non-trainable params: 0
_________________________________________________________________
用时:1609.6508061885834s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
第19个随机模型,拥有18个隐藏层
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 627) 492195
_________________________________________________________________
dense_2 (Dense) (None, 245) 153860
_________________________________________________________________
dense_3 (Dense) (None, 168) 41328
_________________________________________________________________
dense_4 (Dense) (None, 2475) 418275
_________________________________________________________________
dense_5 (Dense) (None, 35) 86660
_________________________________________________________________
dense_6 (Dense) (None, 15148) 545328
_________________________________________________________________
dense_7 (Dense) (None, 11) 166639
_________________________________________________________________
dense_8 (Dense) (None, 10443) 125316
_________________________________________________________________
dense_9 (Dense) (None, 63) 657972
_________________________________________________________________
dense_10 (Dense) (None, 2781) 177984
_________________________________________________________________
dense_11 (Dense) (None, 47) 130754
_________________________________________________________________
dense_12 (Dense) (None, 3047) 146256
_________________________________________________________________
dense_13 (Dense) (None, 13) 39624
_________________________________________________________________
dense_14 (Dense) (None, 47106) 659484
_________________________________________________________________
dense_15 (Dense) (None, 23) 1083461
_________________________________________________________________
dense_16 (Dense) (None, 23149) 555576
_________________________________________________________________
dense_17 (Dense) (None, 36) 833400
_________________________________________________________________
dense_18 (Dense) (None, 28123) 1040551
_________________________________________________________________
dense_19 (Dense) (None, 94) 2643656
_________________________________________________________________
dense_20 (Dense) (None, 10) 950
=================================================================
Total params: 9,999,269
Trainable params: 9,999,269
Non-trainable params: 0
_________________________________________________________________
Elapsed time: 1194.6408140659332s
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
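The results show that, with the total parameter count held fixed, accuracy first rises and then falls as the number of layers grows. For the deeper models the loss oscillates heavily while it decreases, which indicates that they fit even the training set poorly. Parameters are updated with gradients computed by the chain rule, so networks with many layers run into vanishing or exploding gradients: the layers near the output still get updated, but the layers near the input update very slowly. As a result these models fit the training data poorly and do even worse on the validation set.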
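To make the chain-rule argument concrete, here is a minimal sketch of the gradient through a chain of scalar sigmoid units. It assumes sigmoid activations and a weight of 1.0, neither of which is taken from the experiment's code; the helper chain_gradient is hypothetical. Since the sigmoid derivative never exceeds 0.25, the product of per-layer factors shrinks at least geometrically with depth in this setting.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def chain_gradient(depth, weight=1.0, x=0.5):
    """d(output)/d(input) for a chain of `depth` scalar sigmoid units."""
    a = x
    grad = 1.0
    for _ in range(depth):
        a = sigmoid(weight * a)
        # Chain rule: multiply by the local derivative weight * a * (1 - a).
        grad *= weight * a * (1.0 - a)
    return grad


for depth in (2, 5, 10, 20):
    print(depth, chain_gradient(depth))
```

With larger weights the same product can grow instead of shrink, which is the exploding-gradient side of the same problem.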
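Besides depth, the number of units in each layer is also an important factor. Take the simplest case of fully connected layers: if layer l-1 has 1000 units, layer l has 5 units, and layer l+1 has 1000 units, then layer l can carry very little feature information. The 1000-dimensional output of layer l-1 is compressed into 5 dimensions (compare PCA or an autoencoder), which is a severe loss of information. Layer l+1 again has 1000 units, but it can only draw on what the 5 units of layer l pass along, so learning suffers. When building a deep network, therefore, avoid large differences in unit counts between adjacent layers unless the task actually calls for them.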
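The bottleneck can be illustrated with plain linear algebra. The sketch below is a simplification that assumes linear layers (no biases, no activations) with random weights: it composes a wide-to-narrow map with a narrow-to-wide map and measures the rank of the result. The function name bottleneck_rank is made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def bottleneck_rank(wide, narrow):
    """Rank of the linear map wide -> narrow -> wide with random weights."""
    w_down = rng.standard_normal((wide, narrow))  # layer l-1 -> layer l
    w_up = rng.standard_normal((narrow, wide))    # layer l -> layer l+1
    return np.linalg.matrix_rank(w_down @ w_up)


# 1000 -> 5 -> 1000: the composed map can never have rank above 5.
print(bottleneck_rank(1000, 5))
```

However wide layer l+1 is, its input only spans as many directions as layer l has units. Nonlinear activations change the details, but not the fact that everything downstream is a function of those few values.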
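Take random model 10 (logged as having 9 hidden layers) as an example. It is deep, but the Keras summary shows that dense_3 has 16 units, dense_4 has 59593, dense_5 has only 7, and dense_6 has 162643. The later layers follow the same pattern, so the features passed from layer to layer are squeezed severely. Its final accuracy on the training set is 0.0988, which is below 0.1, the level expected from random guessing over 10 classes; on the test set it is about 0.0979. The model is essentially useless.
Random model 10, with 9 hidden layers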
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 640) 502400
_________________________________________________________________
dense_2 (Dense) (None, 510) 326910
_________________________________________________________________
dense_3 (Dense) (None, 16) 8176
_________________________________________________________________
dense_4 (Dense) (None, 59593) 1013081
_________________________________________________________________
dense_5 (Dense) (None, 7) 417158
_________________________________________________________________
dense_6 (Dense) (None, 162643) 1301144
_________________________________________________________________
dense_7 (Dense) (None, 8) 1301152
_________________________________________________________________
dense_8 (Dense) (None, 57961) 521649
_________________________________________________________________
dense_9 (Dense) (None, 19) 1101278
_________________________________________________________________
dense_10 (Dense) (None, 116901) 2338020
_________________________________________________________________
dense_11 (Dense) (None, 10) 1169020
=================================================================
Total params: 9,999,988
Trainable params: 9,999,988
Non-trainable params: 0
_________________________________________________________________
Elapsed time: 2082.286678314209s
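One way to spot this pattern before training is to compare the widths of adjacent layers. The sketch below does that for the unit counts read from the summary above. The helper adjacent_width_ratios and the threshold of 100 are arbitrary choices for illustration, not part of the original experiment.

```python
def adjacent_width_ratios(units):
    """Ratio of the larger to the smaller width for each pair of adjacent layers."""
    return [max(a, b) / min(a, b) for a, b in zip(units, units[1:])]


# Unit counts of random model 10 (dense_1 ... dense_11), from the summary above.
model_10_units = [640, 510, 16, 59593, 7, 162643, 8, 57961, 19, 116901, 10]

ratios = adjacent_width_ratios(model_10_units)
# Entry k means the transition dense_k -> dense_(k+1) exceeds the threshold.
flagged = [i + 1 for i, r in enumerate(ratios) if r > 100]
print(flagged)  # [3, 4, 5, 6, 7, 8, 9, 10]
```

Every transition from dense_3 onward changes width by more than a factor of 100, which matches the squeeze-and-expand pattern described above.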
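Next is the number of epochs (passes over the training set). A single epoch is usually not enough to learn anything useful, so training runs for many epochs to improve the result. Too many epochs, however, make the model overfit the training set. In the training plots, at around 100 epochs both training and validation accuracy are high, while after 1000 epochs validation accuracy has dropped, which shows that the model has overfit the training set. The number of epochs should therefore be chosen case by case, and a regularization penalty can be added to curb overfitting. Plotting accuracy with a visualization tool gives an intuitive picture.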
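Watching the validation curve can also be automated. The sketch below is a plain-Python version of early stopping, a technique the original experiment did not use: it keeps the epoch with the best validation accuracy and stops once `patience` epochs pass without improvement. The function name best_epoch and the accuracy values are made up for illustration.

```python
def best_epoch(val_accuracy, patience):
    """Index of the epoch to keep, stopping once `patience` epochs
    have passed without an improvement in validation accuracy."""
    best_idx = 0
    for idx, acc in enumerate(val_accuracy):
        if acc > val_accuracy[best_idx]:
            best_idx = idx
        elif idx - best_idx >= patience:
            break
    return best_idx


# Made-up validation accuracies, for illustration only.
history = [0.90, 0.95, 0.97, 0.96, 0.96, 0.95, 0.98]
print(best_epoch(history, patience=3))  # 2
```

Keras provides the same idea as the EarlyStopping callback, which takes `monitor` and `patience` arguments and is passed to `fit` through `callbacks`.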
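The last point is how depth affects training time and fit. In this experiment, training took longer and learning quality fell as the network got deeper. Techniques aimed at deep networks address this: Dropout reduces overfitting, and the residual connections of ResNet make very deep networks easier to train.
The test code is as follows:
Note: the code uses the Keras framework with a TensorFlow backend and the MNIST dataset that ships with Keras. All layers are fully connected. The computation ran on a 2080 Ti GPU and took about 4.3 hours in total. To speed it up, reduce the total parameter count and the number of models generated in the loop.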
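As a rough illustration of why residual connections help, here is a simplified fully connected analogue of a residual block in NumPy. Real ResNet blocks use convolutions and batch normalization; the names relu and residual_block are only for this sketch. The block adds its input back to the output of the learned branch, so the signal (and its gradient) always has a direct path through the layer.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def residual_block(x, weights, bias):
    """y = x + relu(x @ weights + bias); input and output widths must match."""
    return x + relu(x @ weights + bias)


x = np.ones((2, 4))
# With zero weights the learned branch contributes nothing,
# so the block reduces to the identity mapping.
y = residual_block(x, np.zeros((4, 4)), np.zeros(4))
print(np.array_equal(y, x))  # True
```

Because the identity path is always present, stacking many such blocks does not force the signal through a long product of small derivatives.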
from keras.utils import np_utils
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
import pandas as pd
import random
import os
import time
import tensorflow as tf
from keras.datasets import mnist
np.random.seed(10)
(x_train_image, y_train_label), (x_test_image, y_test_label) = mnist.load_data()
# Data preprocessing: flatten each 28x28 image into a 784-vector and scale to [0, 1]
x_Train = x_train_image.reshape(60000, 784).astype('float32')
x_Test = x_test_image.reshape(10000, 784).astype('float32')
x_Train_normalize = x_Train / 255
x_Test_normalize = x_Test / 255
# One-hot encode the labels
y_Train_OneHot = np_utils.to_categorical(y_train_label)
y_Test_OneHot = np_utils.to_categorical(y_test_label)
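# Build the network model
# Max_DNN_Layers: maximum number of layers
# Input_Units: number of input units
# Output_Units: number of output units
# Total_Params: total parameter budget
# x_train: training data
# y_train: training labels
def BuildDNNModel(Max_DNN_Layers, Input_Units, Output_Units, Total_Params, x_train, y_train):
    # Enforce a minimum parameter budget
    if Total_Params < 1000:
        Total_Params = 1000
    # Reject invalid input/output sizes
    if Input_Units < 2 or Output_Units < 0:
        return
    my_units = []
    params_remaining = Total_Params
    params_prev = Input_Units
    if Max_DNN_Layers <= 1:
        # Single hidden layer: solve (in + 1) * h + (h + 1) * out = Total_Params for h
        my_units.append(int((Total_Params - Output_Units) / (Input_Units + Output_Units + 1)))
    else:
        for i in range(Max_DNN_Layers - 1):
            # Upper bound on this layer's width: an even share of the remaining budget
            params_limit = int((params_remaining / (params_prev + 1)) / (Max_DNN_Layers - i))
            if params_limit < 4:
                params_limit = 2
            # Pick a random width, then charge its (prev + 1) * units parameters to the budget
            params_now = random.randint(2, params_limit)
            my_units.append(params_now)
            params_remaining = params_remaining - (params_prev + 1) * params_now
            params_prev = params_now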
if params_remaining