Lately I'm the only one using the company's 8-GPU server, so to avoid wasting the resources I've been tinkering with models on my own: building a DenseNet for fun and running it on the famous ImageNet. During the implementation I ran into a big shortcoming of TensorFlow, and I hope someone fills that hole soon.
The idea and architecture of DenseNet were already covered in my CNN evolution history post, so I won't repeat them here; straight to the implementation details (a small sketch follows this list):
- For convolutional layers with kernel size 3x3, each side of the inputs is zero-padded by one pixel to keep the feature-map size fixed.
- For preprocessing, we normalize the data using the channel means and standard deviations.
- We adopt a standard data augmentation scheme (mirroring/shifting).
- All the networks are trained using stochastic gradient descent (SGD). On CIFAR and SVHN we train using batch size 64 for 300 and 40 epochs, respectively. The initial learning rate is set to 0.1, and is divided by 10 at 50% and 75% of the total number of training epochs. (These training hyperparameters follow ResNet exactly.)
- We use a weight decay of 10^-4 and a Nesterov momentum of 0.9 without dampening. (Also following ResNet.)
- We add a dropout layer after each convolutional layer (except the first one) and set the dropout rate to 0.2.
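To make a couple of these details concrete, here is a minimal, hypothetical sketch of one composite function in TFLearn (function and argument names such as `composite_layer`, `growth_rate` and `keep_prob` are my own, not the code from my repo linked below), assuming the BN-ReLU-Conv(3x3)-Dropout ordering:

```python
import tflearn
from tflearn.layers.conv import conv_2d
from tflearn.layers.core import dropout
from tflearn.layers.normalization import batch_normalization

def composite_layer(incoming, growth_rate=12, keep_prob=0.8):
    """One DenseNet composite function: BN -> ReLU -> 3x3 conv -> dropout."""
    net = batch_normalization(incoming)
    net = tflearn.activations.relu(net)
    # padding='same' zero-pads each side by one pixel for a 3x3 kernel, so the
    # feature-map size stays fixed; L2 regularization with weight_decay=1e-4
    # stands in for the paper's weight decay.
    net = conv_2d(net, growth_rate, 3, padding='same',
                  regularizer='L2', weight_decay=0.0001)
    # TFLearn's dropout takes the *keep* probability, so a 0.2 dropout rate
    # corresponds to keep_prob=0.8.
    net = dropout(net, keep_prob)
    return net
```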
Now for the big pitfall I hit: TensorFlow's concatenation has to copy identical data into a brand-new tensor; it cannot simply index into the existing ones. A huge amount of memory gets wasted on these duplicates, which is a real pit, and I hope it gets improved.
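To illustrate: in a straightforward dense block, every layer re-concatenates all previous feature maps into a fresh tensor, so the same activations end up stored many times over. A hypothetical sketch of the naive pattern (`layer_fn` stands for something like the composite function sketched above):

```python
import tensorflow as tf

def naive_dense_block(x, num_layers, layer_fn):
    """Naive dense block: each tf.concat materializes a new copy of every
    earlier feature map, so memory grows roughly quadratically with depth."""
    features = [x]
    for _ in range(num_layers):
        # tf.concat cannot just "view" the existing tensors; it allocates a
        # new buffer and copies all inputs into it.
        concatenated = tf.concat(features, axis=3)
        new_features = layer_fn(concatenated)
        features.append(new_features)
    return tf.concat(features, axis=3)
```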
The DenseNet authors have released a memory-efficient implementation scheme for this.
My implementation: https://github.com/RDShi/DenseNet
While I'm at it, let me plug TFLearn. It feels extremely convenient; at last I no longer have to write tedious raw TensorFlow, which almost moves me to tears.
Here is a brief introduction to TFLearn:
| Module | Layers |
|---|---|
| core | input_data, fully_connected, dropout, custom_layer, reshape, flatten, activation, single_unit, highway, one_hot_encoding, time_distributed |
| conv | conv_2d, conv_2d_transpose, max_pool_2d, avg_pool_2d, upsample_2d, conv_1d, max_pool_1d, avg_pool_1d, residual_block, residual_bottleneck, conv_3d, max_pool_3d, avg_pool_3d, highway_conv_1d, highway_conv_2d, global_avg_pool, global_max_pool |
| recurrent | simple_rnn, lstm, gru, bidirectional_rnn, dynamic_rnn |
| embedding | embedding |
| normalization | batch_normalization, local_response_normalization, l2_normalize |
| merge | merge, merge_outputs |
| estimator | regression |
| Module | Ops |
|---|---|
| activations | linear, tanh, sigmoid, softmax, softplus, softsign, relu, relu6, leaky_relu, prelu, elu |
| objectives | softmax_categorical_crossentropy, categorical_crossentropy, binary_crossentropy, mean_square, hinge_loss, roc_auc_score, weak_cross_entropy_2d |
| optimizers | SGD, RMSProp, Adam, Momentum, AdaGrad, Ftrl, AdaDelta |
| metrics | Accuracy, Top_k, R2 |
| initializations | zeros, uniform, uniform_scaling, normal, truncated_normal, xavier, variance_scaling |
| losses | l1, l2 |
Training:

```python
import tflearn
from tflearn import DNN
from tflearn.layers.estimator import regression

network = ...  # (some layers)
network = regression(network, optimizer='sgd', loss='categorical_crossentropy')
model = DNN(network)
model.fit(X, Y)
```
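As an illustration of how the "(some layers)" part can be filled in with the modules from the tables above, here is a hypothetical toy network (not taken from the TFLearn docs):

```python
import tflearn
from tflearn import DNN
from tflearn.layers.core import input_data, fully_connected, dropout
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression

# A toy 32x32 RGB classifier built from the layer modules listed above.
network = input_data(shape=[None, 32, 32, 3])
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = fully_connected(network, 256, activation='relu')
network = dropout(network, 0.8)  # argument is the keep probability
network = fully_connected(network, 10, activation='softmax')
network = regression(network, optimizer='sgd', loss='categorical_crossentropy')

model = DNN(network)
# model.fit(X, Y) would train it, given suitable X / Y arrays.
```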
Prediction with a saved model:

```python
network = ...  # rebuild the same graph as at training time
model = DNN(network)
model.load('model.tflearn')
model.predict(X)
```
TFLearn provides visualization at different levels of detail (verbose levels), and you can monitor training through TensorBoard.
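The verbosity is chosen when the `DNN` model is constructed; `tensorboard_verbose` goes from 0 (loss and metric only) up to 3 (also gradients, weights and activations, at the cost of training speed). A minimal sketch:

```python
# Level 3 logs the most detail; lower levels log less and run faster.
model = DNN(network, tensorboard_verbose=3)
model.fit(X, Y, show_metric=True)
```

Then point TensorBoard at the log directory (TFLearn writes to /tmp/tflearn_logs/ by default): `tensorboard --logdir=/tmp/tflearn_logs`.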
```python
# Save a model
model.save('my_model.tflearn')
# Load a model
model.load('my_model.tflearn')
```
`load` restores everything, not only the weights: it also brings back the rest of the training state (the optimizer's variables and so on). If you only want the weights, pass `weights_only=True`, which is useful when you have changed the optimizer or similar settings since saving.
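For example:

```python
# Restore only the layer weights; the training state saved in the file
# (optimizer variables, global step, ...) is ignored.
model.load('my_model.tflearn', weights_only=True)
```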
You can also exclude individual layers from being restored, which comes in handy for fine-tuning:

```python
# Weights will be restored by default.
fc_layer = tflearn.fully_connected(input_layer, 32)
# Weights will not be restored, if specified so.
fc_layer = tflearn.fully_connected(input_layer, 32, restore=False)
```
TFLearn's data stream design uses computing pipelines to speed up training: data preprocessing runs on the CPU while the model trains on the GPU.
```python
# Real-time image preprocessing
img_prep = tflearn.ImagePreprocessing()
# Zero Center (with mean computed over the whole dataset)
img_prep.add_featurewise_zero_center()
# STD Normalization (with std computed over the whole dataset)
img_prep.add_featurewise_stdnorm()

# Real-time data augmentation
img_aug = tflearn.ImageAugmentation()
# Randomly flip an image
img_aug.add_random_flip_leftright()
# Randomly crop an image
img_aug.add_random_crop([32, 32], padding=4)

# Add these methods into an 'input_data' layer
network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)
```
Configuring GPU usage and similar settings:

```python
tflearn.init_graph(set_seed=8888, num_cores=16, gpu_memory_fraction=0.5)
```
Tip: to let GPU memory grow on demand instead of being allocated all at once:

```python
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# GRAPH_CONFIG is a collection key that TFLearn adds to tf.GraphKeys;
# the trainer reads this config when it creates its session.
tf.add_to_collection(tf.GraphKeys.GRAPH_CONFIG, config)
```