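UPDATE: To train on your own data, see the follow-up post "(Very Detailed and Complete) Training DeepLabv3+ on Your Own Data with TensorFlow"; finish the setup in this post first.
I have been working on semantic segmentation lately, so I set up DeepLabv3+.
My environment: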
ubuntu 16.04
anaconda3
tensorflow-gpu 1.11.0
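To confirm that the active interpreter sees the same TensorFlow version, a small check like the one below may help. It is only a sketch: the helper names version_tuple and installed_tensorflow_version are made up for illustration, and the second one returns None when TensorFlow is not installed.

```python
import importlib.util
import re


def version_tuple(version):
    """Turn a version string such as '1.11.0' into (1, 11, 0) for comparison."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits, e.g. '0rc1' -> 0
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_tensorflow_version():
    """Return TensorFlow's version string, or None if it is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf
    return tf.__version__


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("tensorflow is not installed in this environment")
    else:
        same = version_tuple(found)[:2] == (1, 11)
        print("tensorflow", found, "(matches 1.11)" if same else "(differs from 1.11)")
```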
First, clone the official tensorflow/models repository:
git clone https://github.com/tensorflow/models.git
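If the download is slow, see my other blog post, which explains how to speed it up dramatically.
The branch you clone can matter; see Issue #6567. In short, on the master branch eval.py may produce no results, and switching to the r1.12.0 branch may help.
Next, check that the environment is set up correctly.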
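Add the dependencies to PYTHONPATH from the directory /home/user/models/research/. The export only affects the current shell, so rerun it in each new terminal or add an equivalent line with absolute paths to ~/.bashrc: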
# From /home/user/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
source ~/.bashrc
Run model_test.py to test the setup:
# From /home/user/models/research/
python deeplab/model_test.py
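If model_test.py fails with an import error, one likely cause is that PYTHONPATH was not set in the current shell. The sketch below checks whether the two directories from the export line are on Python's search path. The function names are hypothetical, and the research directory is the one assumed throughout this post.

```python
import os
import sys


def expected_entries(research_dir):
    """Paths the export line adds: the research directory and its slim subdirectory."""
    research_dir = os.path.abspath(research_dir)
    return [research_dir, os.path.join(research_dir, "slim")]


def missing_entries(research_dir, search_path=None):
    """Return the expected entries that are absent from the given search path."""
    if search_path is None:
        search_path = sys.path
    present = {os.path.abspath(p) for p in search_path if p}
    return [p for p in expected_entries(research_dir) if p not in present]


if __name__ == "__main__":
    # Assumed location, taken from the paths used in this post.
    for path in missing_entries("/home/user/models/research"):
        print("not on sys.path:", path)
```

No output means both directories are visible to the interpreter.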
This walkthrough uses the Cityscapes Dataset.
Download it from www.cityscapes-dataset.com: the images leftImg8bit_trainvaltest.zip (11GB) and the matching annotations gtFine_trainvaltest.zip (241MB). After downloading, extract both into /home/user/data/cityscapesScripts/ (any path works).
Then, in /home/user/data/cityscapesScripts/, clone the Cityscapes Dataset scripts:
git clone https://github.com/mcordts/cityscapesScripts.git
Directory structure after cloning:
/home/user/data/cityscapesScripts
- cityscapesScripts
- leftImg8bit
- gtFine
- tfrecord
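A quick way to confirm the layout before going further is to check for the directories listed above. This is a minimal sketch with a hypothetical helper name. It treats tfrecord as optional on the assumption that the conversion step later in this post fills it.

```python
import os

# Directory names from the layout above. tfrecord is left out here on the
# assumption that the conversion step creates or fills it later.
EXPECTED_DIRS = ("cityscapesScripts", "leftImg8bit", "gtFine")


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected subdirectories that do not exist under root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    root = "/home/user/data/cityscapesScripts"  # path used in this post
    print("missing:", missing_dirs(root))
```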
Copy createTrainIdLabelImgs.py from the cloned scripts into /home/user/data/cityscapesScripts/ and change the following line to your own path:
cityscapesPath = '/home/user/data/cityscapesScripts'
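As far as I can tell, createTrainIdLabelImgs.py works from the polygon annotation files under gtFine, so a wrong cityscapesPath tends to show up as no files being found. The sketch below counts those files per split. It assumes the standard Cityscapes layout gtFine/<split>/<city>/*_gtFine_polygons.json, and count_polygon_files is a hypothetical helper, not part of the official scripts.

```python
import glob
import os


def count_polygon_files(cityscapes_path):
    """Count gtFine polygon annotation files per split (train, val, test)."""
    pattern = os.path.join(cityscapes_path, "gtFine", "*", "*",
                           "*_gtFine_polygons.json")
    counts = {}
    for path in glob.glob(pattern):
        # .../gtFine/<split>/<city>/<file>: the split is two levels above the file.
        split = os.path.basename(os.path.dirname(os.path.dirname(path)))
        counts[split] = counts.get(split, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_polygon_files("/home/user/data/cityscapesScripts"))
```

An empty dictionary suggests the path does not point at the directory that contains gtFine.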
Next, convert the dataset to the tfrecord format that TensorFlow needs for training.
In the directory models/research/deeplab/datasets, edit convert_cityscapes.sh:
CITYSCAPES_ROOT="/home/user/data/cityscapesScripts"
...
python "${CITYSCAPES_ROOT}/createTrainIdLabelImgs.py"
Then run convert_cityscapes.sh to generate the tfrecord files:
sh convert_cityscapes.sh
The generated tfrecord files are in /home/user/data/cityscapesScripts/tfrecord.
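To see at a glance whether both splits were converted, the shards can be grouped by split name. The sketch below assumes shard files are named like <split>-00000-of-00010.tfrecord. That pattern and the helper name are assumptions, so adjust the regular expression if your files look different.

```python
import os
import re
from collections import Counter

# Assumed shard naming: <split>-<index>-of-<total>.tfrecord
SHARD_RE = re.compile(r"^(?P<split>.+)-\d{5}-of-\d{5}\.tfrecord$")


def shards_per_split(filenames):
    """Count tfrecord shards per split; files that do not match are ignored."""
    counts = Counter()
    for name in filenames:
        match = SHARD_RE.match(os.path.basename(name))
        if match:
            counts[match.group("split")] += 1
    return dict(counts)


if __name__ == "__main__":
    tfrecord_dir = "/home/user/data/cityscapesScripts/tfrecord"  # path from this post
    if os.path.isdir(tfrecord_dir):
        print(shards_per_split(os.listdir(tfrecord_dir)))
```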
First, download the pretrained weights xception_cityscapes_trainfine: http://download.tensorflow.org/models/deeplabv3_cityscapes_train_2018_02_06.tar.gz (for more pretrained weights, see https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md ).
Extract the pretrained weights in the directory models/research/deeplab/backbone/deeplabv3_cityscapes_train.
Then write the training command:
python deeplab/train.py \
--logtostderr \
--training_number_of_steps=1000 \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=513 \
--train_crop_size=513 \
--train_batch_size=2 \
--dataset="cityscapes" \
--tf_initial_checkpoint='/home/user/models/research/deeplab/backbone/deeplabv3_cityscapes_train/model.ckpt' \
--train_logdir='/home/user/models/research/deeplab/exp/train_on_train_set/train' \
--dataset_dir='/home/user/data/cityscapesScripts/tfrecord'
Run the command above from /home/user/models/research.
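Note that atrous_rates and train_crop_size take several values by repeating the flag, which is easy to get wrong when editing the command by hand. As an illustration only, the sketch below builds the same argument list from a dictionary. build_flags is a hypothetical helper, and the three path flags are omitted for brevity.

```python
def build_flags(options):
    """Turn a dict of flag values into a list of --name=value arguments.

    A list or tuple value repeats the flag once per element, which is how the
    training command passes atrous_rates and train_crop_size.
    """
    args = []
    for name, value in options.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            args.append("--{}={}".format(name, item))
    return args


train_options = {
    "training_number_of_steps": 1000,
    "train_split": "train",
    "model_variant": "xception_65",
    "atrous_rates": [6, 12, 18],
    "output_stride": 16,
    "decoder_output_stride": 4,
    "train_crop_size": [513, 513],
    "train_batch_size": 2,
    "dataset": "cityscapes",
}

# --logtostderr is a boolean flag, so it is added without a value.
command = ["python", "deeplab/train.py", "--logtostderr"] + build_flags(train_options)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run once the checkpoint, log, and dataset paths are appended.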
Finished training:
INFO:tensorflow:global step 970: loss = 0.4412 (0.504 sec/step)
INFO:tensorflow:global step 980: loss = 0.4016 (0.464 sec/step)
INFO:tensorflow:global step 990: loss = 0.6123 (0.520 sec/step)
INFO:tensorflow:global step 1000: loss = 0.3552 (0.492 sec/step)
INFO:tensorflow:Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
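The loss jumps around from step to step, so the last value alone says little. A small parser like this sketch pulls the step, loss, and step time out of log lines in the format shown above and averages them. The regular expression is written against those four lines only.

```python
import re

STEP_RE = re.compile(r"global step (\d+): loss = ([0-9.]+) \(([0-9.]+) sec/step\)")


def parse_training_log(lines):
    """Return (step, loss, seconds_per_step) tuples for matching log lines."""
    records = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            records.append((int(match.group(1)),
                            float(match.group(2)),
                            float(match.group(3))))
    return records


log = """\
INFO:tensorflow:global step 970: loss = 0.4412 (0.504 sec/step)
INFO:tensorflow:global step 980: loss = 0.4016 (0.464 sec/step)
INFO:tensorflow:global step 990: loss = 0.6123 (0.520 sec/step)
INFO:tensorflow:global step 1000: loss = 0.3552 (0.492 sec/step)
INFO:tensorflow:Stopping Training.
""".splitlines()

records = parse_training_log(log)
mean_loss = sum(loss for _, loss, _ in records) / len(records)
print("last step:", records[-1][0], "mean loss:", round(mean_loss, 4))
```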
Write the evaluation command:
python deeplab/eval.py \
--logtostderr \
--eval_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--eval_crop_size=1025 \
--eval_crop_size=2049 \
--dataset="cityscapes" \
--checkpoint_dir='/home/user/models/research/deeplab/exp/train_on_train_set/train' \
--eval_logdir='/home/user/models/research/deeplab/exp/train_on_train_set/eval' \
--dataset_dir='/home/user/data/cityscapesScripts/tfrecord'
Run the command above from /home/user/models/research.
Finished evaluation:
INFO:tensorflow:Starting evaluation at 2019-04-01-08:49:56
INFO:tensorflow:Evaluation [50/500]
INFO:tensorflow:Evaluation [100/500]
INFO:tensorflow:Evaluation [150/500]
INFO:tensorflow:Evaluation [200/500]
INFO:tensorflow:Evaluation [250/500]
INFO:tensorflow:Evaluation [300/500]
INFO:tensorflow:Evaluation [350/500]
INFO:tensorflow:Evaluation [400/500]
INFO:tensorflow:Evaluation [450/500]
INFO:tensorflow:Evaluation [500/500]
INFO:tensorflow:Finished evaluation at 2019-04-01-08:52:12
miou_1.0[0.725369751]
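If it then prints INFO:tensorflow:Waiting for new checkpoint at /home/user/models/research/deeplab/exp/train_on_train_set/train, eval.py is only waiting for further checkpoints; stop it (for example with Ctrl+C) and ignore the message.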
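The miou value reported above is the mean intersection-over-union on the val split. As a reminder of what that number measures, here is a plain-Python sketch of the definition computed from a confusion matrix. It is an illustration only, not the code eval.py runs, and the tiny two-class matrix is made up.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes that appear in neither the labels nor the predictions are skipped.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_positive = confusion[c][c]
        labelled = sum(confusion[c])                                  # pixels labelled c
        predicted = sum(confusion[r][c] for r in range(num_classes))  # pixels predicted c
        union = labelled + predicted - true_positive
        if union > 0:
            ious.append(true_positive / union)
    return sum(ious) / len(ious) if ious else 0.0


# Made-up example: class 0 has IoU 2/3, class 1 has IoU 1/2.
example = [[2, 0],
           [1, 1]]
print(mean_iou(example))
```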
Write the visualization command:
python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=1025 \
--vis_crop_size=2049 \
--dataset="cityscapes" \
--colormap_type="cityscapes" \
--checkpoint_dir='/home/user/models/research/deeplab/exp/train_on_train_set/train' \
--vis_logdir='/home/user/models/research/deeplab/exp/train_on_train_set/vis' \
--dataset_dir='/home/user/data/cityscapesScripts/tfrecord'
Run the command above from /home/user/models/research.
Visualization starts:
INFO:tensorflow:Restoring parameters from /home/user/models/research/deeplab/exp/train_on_train_set/train/model.ckpt-1000
INFO:tensorflow:Visualizing batch 1 / 500
INFO:tensorflow:Visualizing batch 2 / 500
INFO:tensorflow:Visualizing batch 3 / 500
INFO:tensorflow:Visualizing batch 4 / 500
...
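Once vis.py finishes, it is handy to line up each input image with its predicted segmentation. The sketch below pairs files by a shared numeric prefix. It assumes the results sit in a segmentation_results folder under vis_logdir and are named like 000000_image.png and 000000_prediction.png. Both the folder name and the suffixes are assumptions to verify against your own output, and pair_results is a hypothetical helper.

```python
import os


def pair_results(filenames, image_suffix="_image.png",
                 prediction_suffix="_prediction.png"):
    """Pair each input image with the prediction that shares its prefix."""
    images, predictions = {}, {}
    for name in filenames:
        base = os.path.basename(name)
        if base.endswith(image_suffix):
            images[base[:-len(image_suffix)]] = name
        elif base.endswith(prediction_suffix):
            predictions[base[:-len(prediction_suffix)]] = name
    # Keep only prefixes that have both files, in sorted order.
    return [(images[key], predictions[key])
            for key in sorted(images) if key in predictions]


if __name__ == "__main__":
    # Assumed output folder under the vis_logdir used in this post.
    results_dir = ("/home/user/models/research/deeplab/exp/"
                   "train_on_train_set/vis/segmentation_results")
    if os.path.isdir(results_dir):
        for image, prediction in pair_results(os.listdir(results_dir))[:5]:
            print(image, "->", prediction)
```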