The MobileNet object detection package lives in tensorflow/models/object_detection. Start by downloading a pretrained ssd_mobilenet_v1 model.
Prepare a dataset in VOC format
Use the script ../object_detection/create_pascal_tf_record.py to convert the VOC data into a tf.record file.
# DATA
python bread_pascal_record.py --data_dir=/home/wrz/zifu/VOCdevkit2007/VOC2007 --year=VOC2007 --output_path=/hyw/zifu.record --label_map_path=/hyw/zifu_label_map.pbtxt
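The command above expects a label map file at --label_map_path. As a minimal sketch, such a file could be generated from a list of class names like this. The class names here are hypothetical placeholders, and ids start at 1 because id 0 is reserved for the background class.

```python
def build_label_map(class_names):
    """Render class names as label-map text; ids start at 1 (0 is the background)."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return "\n".join(items)


# Hypothetical class names, for illustration only.
label_map_text = build_label_map(["digit", "letter"])
print(label_map_text)
```

Writing label_map_text to the .pbtxt path passed on the command line should be enough for a small dataset.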
Print data and check data[dir] and data[filename], then adjust img_path in the function dict_to_tf_example accordingly.
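To see which fields an annotation actually carries before touching img_path, something like the following stdlib-only sketch may help. The sample XML, the folder and filename tags, and the JPEGImages subdirectory are assumptions based on the usual VOC layout, not taken from the script itself.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical annotation in the usual VOC layout.
SAMPLE_XML = """<annotation>
  <folder>VOC2007</folder>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Extract the folder, the filename and the labelled boxes from one annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        box = tuple(int(bndbox.findtext(tag))
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append({"name": obj.findtext("name"), "box": box})
    return {"folder": root.findtext("folder"),
            "filename": root.findtext("filename"),
            "objects": objects}


data = parse_annotation(SAMPLE_XML)
# Assumed layout: <folder>/JPEGImages/<filename>; adapt this to your own dataset.
img_path = os.path.join(data["folder"], "JPEGImages", data["filename"])
print(data["folder"], data["filename"], img_path)
```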
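dict_to_tf_example shows how the xml file is read and how the tf.example is created; this is the place to make changes if your labels need special handling.
Network configuration file
../object_detection/samples/configs/ contains ssd_mobilenet_v1_pets.config, the network configuration file used to train mobilenet_ssd. Modify the following places: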
The network configuration file is structured as follows:
model: {}
train_config: {
  optimizer: {}
  data_augmentation_options: {}
  ...
}
train_input_reader: {}
eval_config: {}
eval_input_reader: {}
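Much like caffe.proto defines the parameters of layers, TensorFlow defines the parameters of each part of the config file under object_detection/protos/. For example, learning_rate can be found in optimizer.proto, and batch_size is in train_config.proto: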
syntax = "proto2";

package object_detection.protos;

import "object_detection/protos/optimizer.proto";
import "object_detection/protos/preprocessor.proto";

message TrainConfig {
  // Input queue batch size.
  optional uint32 batch_size = 1 [default=32];

  // Data augmentation options.
  repeated PreprocessingStep data_augmentation_options = 2;

  // Whether to synchronize replicas during training.
  optional bool sync_replicas = 3 [default=false];

  // How frequently to keep checkpoints.
  optional uint32 keep_checkpoint_every_n_hours = 4 [default=1000];

  // Optimizer used to train the DetectionModel.
  optional Optimizer optimizer = 5;
  ......
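Since batch_size is declared in TrainConfig, it is set inside the train_config block of the config file. Below is a rough sketch of changing it with plain string substitution. The sample config text and the helper name set_scalar_field are made up for illustration, and a regex like this only handles simple scalar fields.

```python
import re

# Hypothetical config fragment; the value 24 is only a sample.
SAMPLE_CONFIG = """train_config: {
  batch_size: 24
  optimizer {
  }
}
"""


def set_scalar_field(config_text, field, value):
    """Replace the first `field: <scalar>` occurrence with the new value."""
    pattern = r"(\b%s\s*:\s*)[-+.\w]+" % re.escape(field)
    new_text, count = re.subn(
        pattern, lambda m: m.group(1) + str(value), config_text, count=1)
    if count == 0:
        raise KeyError("field not found: %s" % field)
    return new_text


updated = set_scalar_field(SAMPLE_CONFIG, "batch_size", 8)
print(updated)
```

Any field left out of the config falls back to the default declared in the proto, which is 32 for batch_size.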
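In train_config.proto, data_augmentation_options is a repeated PreprocessingStep field. The = 2 after it is the protobuf field number, not a default value, so no augmentation is applied unless the config file lists it. preprocessor.proto defines the available augmentation steps, for example random_horizontal_flip for horizontal flipping: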
syntax = "proto2";

package object_detection.protos;

// Message for defining a preprocessing operation on input data.
// See: //object_detection/core/preprocessor.py
message PreprocessingStep {
  oneof preprocessing_step {
    NormalizeImage normalize_image = 1;
    RandomHorizontalFlip random_horizontal_flip = 2;
    RandomPixelValueScale random_pixel_value_scale = 3;
    RandomImageScale random_image_scale = 4;
    RandomRGBtoGray random_rgb_to_gray = 5;
    RandomAdjustBrightness random_adjust_brightness = 6;
    RandomAdjustContrast random_adjust_contrast = 7;
    RandomAdjustHue random_adjust_hue = 8;
    RandomAdjustSaturation random_adjust_saturation = 9;
    RandomDistortColor random_distort_color = 10;
    RandomJitterBoxes random_jitter_boxes = 11;
    RandomCropImage random_crop_image = 12;
    RandomPadImage random_pad_image = 13;
    RandomCropPadImage random_crop_pad_image = 14;
    RandomCropToAspectRatio random_crop_to_aspect_ratio = 15;
    RandomBlackPatches random_black_patches = 16;
    RandomResizeMethod random_resize_method = 17;
    ScaleBoxesToPixelCoordinates scale_boxes_to_pixel_coordinates = 18;
    ResizeImage resize_image = 19;
    SubtractChannelMean subtract_channel_mean = 20;
    SSDRandomCrop ssd_random_crop = 21;
    SSDRandomCropPad ssd_random_crop_pad = 22;
    SSDRandomCropFixedAspectRatio ssd_random_crop_fixed_aspect_ratio = 23;
  }
}
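Each data_augmentation_options entry in the config selects exactly one member of the oneof above by its field name. As a small sketch, the entries could be rendered like this. The two steps chosen are only examples, and their parameters are left empty so that the proto defaults apply.

```python
def render_augmentations(step_names):
    """Render one data_augmentation_options entry per preprocessing step name."""
    blocks = []
    for name in step_names:
        blocks.append("data_augmentation_options {\n  %s {\n  }\n}" % name)
    return "\n".join(blocks)


# Example selection; any field name from PreprocessingStep could be used here.
config_fragment = render_augmentations(
    ["random_horizontal_flip", "ssd_random_crop"])
print(config_fragment)
```

The resulting text belongs inside the train_config block of the config file.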
Training command:
python train.py --logtostderr --train_dir=hyw/output/zifu --pipeline_config_path=hyw/ssd_mobilenet_v1_pets.config
Saving the model:
In trainer.py you can see that saving the model mainly relies on this call:
saver = tf.train.Saver(keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
The official TensorFlow documentation describes Saver in detail: https://www.tensorflow.org/api_docs/python/tf/train/Saver
defined in "tensorflow/python/training/saver.py"
__init__(
    var_list=None,
    reshape=False,
    sharded=False,
    max_to_keep=5,
    keep_checkpoint_every_n_hours=10000.0,
    name=None,
    restore_sequentially=False,
    saver_def=None,
    builder=None,
    defer_build=False,
    allow_empty=False,
    write_version=tf.train.SaverDef.V2,
    pad_step_number=False,
    save_relative_paths=False,
    filename=None
)
Args:
var_list: A list of Variable/SaveableObject, or a dictionary mapping names to SaveableObjects. If None, defaults to the list of all saveable objects.
reshape: If True, allows restoring parameters from a checkpoint where the variables have a different shape.
sharded: If True, shard the checkpoints, one per device.
max_to_keep: Maximum number of recent checkpoints to keep. Defaults to 5.
keep_checkpoint_every_n_hours: How often to keep checkpoints. Defaults to 10,000 hours.
name: String. Optional name to use as a prefix when adding operations.
restore_sequentially: A Bool, which if true, causes restore of different variables to happen sequentially within each device. This can lower memory usage when restoring very large models.
saver_def: Optional SaverDef proto to use instead of running the builder. This is only useful for specialty code that wants to recreate a Saver object for a previously built Graph that had a Saver. The saver_def proto should be the one returned by the as_saver_def() call of the Saver that was created for that Graph.
builder: Optional SaverBuilder to use if a saver_def was not provided. Defaults to BaseSaverBuilder().
defer_build: If True, defer adding the save and restore ops to the build() call. In that case build() should be called before finalizing the graph or using the saver.
allow_empty: If False (default) raise an error if there are no variables in the graph. Otherwise, construct the saver anyway and make it a no-op.
write_version: controls what format to use when saving checkpoints. It also affects certain filepath matching logic. The V2 format is the recommended choice: it is much more optimized than V1 in terms of memory required and latency incurred during restore. Regardless of this flag, the Saver is able to restore from both V2 and V1 checkpoints.
pad_step_number: if True, pads the global step number in the checkpoint filepaths to some fixed width (8 by default). This is turned off by default.
save_relative_paths: If True, will write relative paths to the checkpoint state file. This is needed if the user wants to copy the checkpoint directory and reload from the copied directory.
filename: If known at graph construction time, filename used for variable loading/saving.
Here max_to_keep=5 means that by default only the 5 most recent checkpoints are kept; set max_to_keep=0 to keep all of them.
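The retention rule can be illustrated with a small simulation. This is not how Saver is implemented: it only mimics the "keep the most recent N" behaviour and ignores keep_checkpoint_every_n_hours. The step numbers are made up.

```python
from collections import deque


def retained_checkpoints(saved_steps, max_to_keep=5):
    """Return the checkpoint steps still on disk after saving at each step in order."""
    if not max_to_keep:  # 0 or None: nothing is ever deleted
        return list(saved_steps)
    recent = deque(maxlen=max_to_keep)  # the oldest entry drops out automatically
    for step in saved_steps:
        recent.append(step)
    return list(recent)


# Hypothetical steps at which a checkpoint was written.
steps = [1000, 2000, 3000, 4000, 5000, 6000, 7000]
print(retained_checkpoints(steps))
print(retained_checkpoints(steps, max_to_keep=0))
```

With the default of 5, the first two checkpoints are gone by the time the seventh is written, which matters if you want to compare early and late models.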
Training parameters
Training is driven by this call in trainer.py:
slim = tf.contrib.slim
slim.learning.train(
    train_tensor,
    logdir=train_dir,
    master=master,
    is_chief=is_chief,
    session_config=session_config,
    startup_delay_steps=train_config.startup_delay_steps,
    init_fn=init_fn,
    summary_op=summary_op,
    number_of_steps=(
        train_config.num_steps if train_config.num_steps else None),
    save_summaries_secs=120,
    sync_optimizer=sync_optimizer,
    saver=saver)
train is defined in tensorflow/contrib/slim/python/slim/learning.py as:
def train(train_op,
          logdir,
          train_step_fn=train_step,
          train_step_kwargs=_USE_DEFAULT,
          log_every_n_steps=1,
          graph=None,
          master='',
          is_chief=True,
          global_step=None,
          number_of_steps=None,
          init_op=_USE_DEFAULT,
          init_feed_dict=None,
          local_init_op=_USE_DEFAULT,
          init_fn=None,
          ready_op=_USE_DEFAULT,
          summary_op=_USE_DEFAULT,
          save_summaries_secs=600,
          summary_writer=_USE_DEFAULT,
          startup_delay_steps=0,
          saver=None,
          save_interval_secs=600,
          sync_optimizer=None,
          session_config=None,
          trace_every_n_steps=None):
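Here number_of_steps sets the maximum number of training iterations. save_summaries_secs sets how often, in seconds, summaries are written (600 by default; trainer.py passes 120), and save_interval_secs=600 sets how often, in seconds, the model is saved.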
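To get a feel for how these settings interact, here is a rough simulation that counts periodic saves over a run. It is only an illustration under the assumption of a constant step time: it does not call slim, and it ignores any extra save made when training exits. The step count and the 3 seconds per step are invented numbers.

```python
def count_periodic_saves(number_of_steps, secs_per_step,
                         save_interval_secs=600, save_summaries_secs=600):
    """Count checkpoint and summary writes for a run with a constant step time."""
    elapsed = 0
    next_checkpoint = save_interval_secs
    next_summary = save_summaries_secs
    checkpoints = 0
    summaries = 0
    for _ in range(number_of_steps):  # training stops after number_of_steps
        elapsed += secs_per_step
        while elapsed >= next_checkpoint:
            checkpoints += 1
            next_checkpoint += save_interval_secs
        while elapsed >= next_summary:
            summaries += 1
            next_summary += save_summaries_secs
    return checkpoints, summaries


# Hypothetical run: 1000 steps at 3 seconds each, summaries every 120 seconds.
result = count_periodic_saves(1000, 3, save_summaries_secs=120)
print(result)
```

Combined with max_to_keep, this gives an estimate of how far back the checkpoints on disk reach.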
In the ../object_detection/ directory, run the following on the command line:
jupyter notebook
Open ../models/object_detection/object_detection_tutorial.ipynb and run it.
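The notebook loads its model from a .pb file, whereas TensorFlow training produces .meta and .ckpt files, which store the graph structure and the parameter values respectively, so the checkpoint has to be converted first. Use the script ../object_detection/export_inference_graph.py: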
python export_inference_graph.py --input_type image_tensor --pipeline_config_path ./hyw/ssd_mobilenet_v1_pets.config --checkpoint_path ./hyw/output/zifu/model.ckpt-34940 --inference_graph_path ./hyw/output/model-34940.pb
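The --checkpoint_path argument takes a checkpoint prefix such as model.ckpt-34940 rather than a single file. As a sketch, the newest prefix in a training directory could be picked from the .meta filenames like this. The directory listing is hypothetical; in practice you would pass the result of os.listdir on the training directory.

```python
import re


def latest_checkpoint_prefix(filenames, prefix="model.ckpt"):
    """Return the checkpoint prefix with the highest step, or None if there is none."""
    pattern = re.compile(r"^%s-(\d+)\.meta$" % re.escape(prefix))
    steps = [int(m.group(1)) for m in map(pattern.match, filenames) if m]
    if not steps:
        return None
    return "%s-%d" % (prefix, max(steps))


# Hypothetical directory listing.
files = ["checkpoint",
         "model.ckpt-30000.index", "model.ckpt-30000.meta",
         "model.ckpt-34940.index", "model.ckpt-34940.meta",
         "model.ckpt-34940.data-00000-of-00001"]
print(latest_checkpoint_prefix(files))
```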
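A final note: the measured detection time includes converting the image to an np.array, reading and loading the model, and running detection on the image. To keep detection fast, keep the current graph and sess alive and reuse them instead of reloading the model for every image.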
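The load-once idea can be sketched without TensorFlow as follows. CachedDetector and fake_load are hypothetical names; in real use the loader would build the graph and the session, and the returned callable would wrap sess.run.

```python
class CachedDetector:
    """Load the model on first use and reuse it for every later image."""

    def __init__(self, load_fn):
        self._load_fn = load_fn  # expensive: e.g. builds the graph and session
        self._model = None
        self.load_count = 0

    def detect(self, image):
        if self._model is None:  # only the first call pays the loading cost
            self._model = self._load_fn()
            self.load_count += 1
        return self._model(image)


def fake_load():
    """Stand-in loader (hypothetical); returns a callable playing the role of sess.run."""
    return lambda image: {"num_detections": len(image)}


detector = CachedDetector(fake_load)
results = [detector.detect(img) for img in (["a"], ["b", "c"], [])]
print(detector.load_count, results)
```

However many images are processed, the loader runs only once, so the per-image cost is reduced to the conversion and the detection itself.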
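On an i5 6300HQ, detecting a single image takes about 0.1 seconds, although the detection time seems to grow in proportion to the number of objects in the image. On a Celeron J3160 it takes about 1 second per image. On a desktop i7, TensorFlow compiled from source is faster than the pip-installed build, at a little over 0.04 seconds per image, which matches the time claimed by the author.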
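Timings like these are easiest to compare when the first call, which often carries one-off setup cost, is excluded. A minimal sketch of such a measurement is shown below. fake_detect is a hypothetical stand-in for the real detection call, so the numbers it produces say nothing about the model.

```python
import time


def time_calls(fn, inputs, warmup=1):
    """Return per-call durations in seconds, skipping the first `warmup` calls."""
    durations = []
    for index, item in enumerate(inputs):
        start = time.perf_counter()
        fn(item)
        elapsed = time.perf_counter() - start
        if index >= warmup:
            durations.append(elapsed)
    return durations


def fake_detect(image):
    """Hypothetical stand-in for one detection call."""
    return sum(image)


timings = time_calls(fake_detect, [[1, 2, 3]] * 6)
mean_seconds = sum(timings) / len(timings)
print("mean seconds per image: %.6f" % mean_seconds)
```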